Txtbot

Simple Article Extraction API

Accurate Content Extraction

Our technology identifies and extracts the main content while filtering out unnecessary page elements like ads and navigation.

  • Perfect for research and analysis
  • Maintains original formatting
  • Works across most websites

Key Features

Fast Processing

Get article content in seconds with our efficient extraction.

Save Time

Automate content collection instead of manual copying.

Affordable

Simple pricing at just 5 cents per page.

API Documentation

Our API endpoint is available at: https://txtbot.xyz/api/v1/page?url=[url]

Simply replace [url] with the URL-encoded address of the page you want to extract content from.
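
To see why the encoding step matters, consider a page address that carries its own query string. The sketch below uses only the Python standard library; the article URL with its id and ref parameters and the helper name build_api_url are made up for illustration, and only the endpoint itself comes from this documentation. Without encoding, the page's own ref parameter would be read as a second parameter of the API call instead of as part of the page address.

```python
from urllib.parse import parse_qs, quote, urlsplit

API_ENDPOINT = "https://txtbot.xyz/api/v1/page"


def build_api_url(page_url: str) -> str:
    """Return the API request URL, percent-encoding every reserved character."""
    # safe='' makes quote() encode "/" as well, which it leaves alone by default.
    return f"{API_ENDPOINT}?url={quote(page_url, safe='')}"


# Hypothetical page address with its own query string (illustration only).
page_url = "https://example.com/article?id=1&ref=home"

api_url = build_api_url(page_url)
print(api_url)

# Decoding the query string yields a single "url" parameter holding the full address.
params = parse_qs(urlsplit(api_url).query)
print(params)

# Without encoding, the page's own "ref" parameter leaks into the API query.
naive_params = parse_qs(urlsplit(f"{API_ENDPOINT}?url={page_url}").query)
print(naive_params)
```

In the encoded form the whole address survives as one value, while the unencoded form splits it into a truncated url value plus a stray ref parameter.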

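The API returns a JSON response with the extracted content and metadata. The examples below authenticate by sending your API key as a bearer token in the Authorization header.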

cURL

curl -X GET "https://txtbot.xyz/api/v1/page?url=https%3A%2F%2Fexample.com%2Farticle" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY"

JavaScript

// Using the fetch API
const url = 'https://example.com/article';
const encodedUrl = encodeURIComponent(url);
const apiUrl = `https://txtbot.xyz/api/v1/page?url=${encodedUrl}`;

fetch(apiUrl, {
  method: 'GET',
  headers: {
    'Accept': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  }
})
  .then(response => {
    // fetch only rejects on network failure, so check the HTTP status explicitly
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
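
Python

import requests
from urllib.parse import quote

url = "https://example.com/article"
# safe='' also percent-encodes "/" so the whole address fits in one query parameter
encoded_url = quote(url, safe='')
api_url = f"https://txtbot.xyz/api/v1/page?url={encoded_url}"

headers = {
    "Accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
}

# The 30-second timeout is an arbitrary choice; adjust it to your needs
response = requests.get(api_url, headers=headers, timeout=30)
response.raise_for_status()  # raise on 4xx/5xx responses before parsing the body
data = response.json()
print(data)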
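
PHP

<?php
$url = "https://example.com/article";
$encodedUrl = urlencode($url);
$apiUrl = "https://txtbot.xyz/api/v1/page?url=" . $encodedUrl;

$options = [
    'http' => [
        'method' => 'GET',
        'header' =>
            "Accept: application/json\r\n" .
            "Authorization: Bearer YOUR_API_KEY\r\n"
    ]
];

$context = stream_context_create($options);
$response = file_get_contents($apiUrl, false, $context);
$data = json_decode($response, true);

print_r($data);
?>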

Ruby

require 'uri'
require 'net/http'
require 'json'

url = "https://example.com/article"
encoded_url = URI.encode_www_form_component(url)
api_url = "https://txtbot.xyz/api/v1/page?url=#{encoded_url}"

uri = URI(api_url)
req = Net::HTTP::Get.new(uri)
req['Accept'] = 'application/json'
req['Authorization'] = 'Bearer YOUR_API_KEY'

# The endpoint is HTTPS, so TLS must be enabled explicitly
res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http|
  http.request(req)
}

data = JSON.parse(res.body)
puts data

Example Response

{
  "success": true,
  "data": {
    "title": "Example Article Title",
    "author": "John Doe",
    "published_date": "2023-05-15T10:30:00Z",
    "content": "This is the full extracted article content with all the main text from the webpage. It includes paragraphs, headings, and other relevant content while excluding ads, navigation, and other non-essential elements.",
    "word_count": 1245,
    "reading_time": 6,
    "source_url": "https://example.com/article",
    "extracted_at": "2023-10-20T14:45:22Z"
  },
  "metadata": {
    "version": "1.0",
    "processing_time": 1.24
  }
}

success: Boolean indicating if the request was successful

data.title: The title of the extracted article

data.author: Author of the article (if detectable)

data.published_date: Publication date of the article

data.content: The main content of the article with clean formatting

data.word_count: Total number of words in the extracted content

data.reading_time: Estimated reading time in minutes

data.source_url: The original URL that was processed

data.extracted_at: Timestamp when the extraction occurred

metadata.version: API version used for the extraction

metadata.processing_time: Time taken to process the request (in seconds)
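
The fields above can be mapped onto a typed structure before further use. The following is a minimal Python sketch, not an official client: the Article class and the parse_article and parse_timestamp helpers are hypothetical names, the sample payload is a shortened copy of the example response, and two things are assumptions because the documentation does not specify them: that author and published_date may be missing or null when they cannot be detected, and that a failed request is signalled only by success being false.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    title: str
    author: Optional[str]            # assumed to be absent or null if not detectable
    published_date: Optional[datetime]  # assumed optional as well
    content: str
    word_count: int
    reading_time: int                # minutes
    source_url: str


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2023-05-15T10:30:00Z, or return None."""
    if not value:
        return None
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_article(payload: dict) -> Article:
    """Convert a decoded response into an Article; raise if success is false."""
    if not payload.get("success"):
        # The shape of an error response is not documented, so only the flag is checked.
        raise ValueError("extraction was not successful")
    data = payload["data"]
    return Article(
        title=data["title"],
        author=data.get("author"),
        published_date=parse_timestamp(data.get("published_date")),
        content=data["content"],
        word_count=data["word_count"],
        reading_time=data["reading_time"],
        source_url=data["source_url"],
    )


# Shortened copy of the example response.
sample = json.loads("""
{
  "success": true,
  "data": {
    "title": "Example Article Title",
    "author": "John Doe",
    "published_date": "2023-05-15T10:30:00Z",
    "content": "This is the full extracted article content.",
    "word_count": 1245,
    "reading_time": 6,
    "source_url": "https://example.com/article",
    "extracted_at": "2023-10-20T14:45:22Z"
  },
  "metadata": {"version": "1.0", "processing_time": 1.24}
}
""")

article = parse_article(sample)
print(article.title, article.word_count, article.published_date)
```

Checking the success flag before touching data keeps a failed extraction from surfacing later as a confusing missing-key error.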

Frequently Asked Questions

How does the extraction process work?

Simply provide the URL of the page you want to extract, and our API will return the clean article text in a structured format, removing all ads and unnecessary elements.

Is there a limit to how many pages I can extract?

No, you can extract as many pages as you need. We charge 5 cents per page processed.
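
As a rough budgeting aid, the per-page price translates into a simple multiplication. The Python sketch below assumes a flat rate with no volume discounts, taxes, or minimum charges, none of which are addressed here; the function names are hypothetical and the currency is left unspecified because it is not stated.

```python
PRICE_PER_PAGE_CENTS = 5  # flat rate quoted in the pricing above


def estimate_cost_cents(pages: int) -> int:
    """Return the estimated charge in cents for a number of processed pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * PRICE_PER_PAGE_CENTS


def format_amount(cents: int) -> str:
    """Format integer cents as a decimal amount, assuming 100 cents per currency unit."""
    return f"{cents // 100}.{cents % 100:02d}"


for pages in (1, 250, 10_000):
    print(pages, format_amount(estimate_cost_cents(pages)))
```

Working in integer cents avoids floating-point rounding when the totals are summed across many requests.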

How can I get API access?

Contact us to request API access. We're currently onboarding select users during our beta period.

What payment methods do you accept?

We accept credit cards and PayPal. Enterprise customers can request invoice billing.

Ready to start extracting content?

Join our beta program and get started today
