How to fetch a URL → curl
Table of Contents
1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Basic curl Syntax](#basic-curl-syntax)
4. [Simple URL Fetching](#simple-url-fetching)
5. [Common curl Options](#common-curl-options)
6. [Advanced Usage Examples](#advanced-usage-examples)
7. [Working with Different HTTP Methods](#working-with-different-http-methods)
8. [Handling Authentication](#handling-authentication)
9. [Working with Headers and Cookies](#working-with-headers-and-cookies)
10. [File Downloads and Uploads](#file-downloads-and-uploads)
11. [Troubleshooting Common Issues](#troubleshooting-common-issues)
12. [Best Practices](#best-practices)
13. [Performance Optimization](#performance-optimization)
14. [Conclusion](#conclusion)
Introduction
curl (Client URL) is a powerful command-line tool and library for transferring data with URLs. It supports numerous protocols including HTTP, HTTPS, FTP, FTPS, and many others. Whether you're a developer testing APIs, a system administrator monitoring web services, or someone who needs to download files programmatically, curl is an indispensable tool in your arsenal.
This guide covers fetching URLs with curl, from basic usage to advanced techniques. You'll learn how to make HTTP requests, handle different response types, work with authentication, and troubleshoot common issues that arise when working with web services.
By the end of this article, you'll be proficient in using curl for various web-related tasks and understand how to leverage its powerful features for your specific needs.
Prerequisites
Before diving into curl usage, ensure you have the following:
System Requirements
- Operating System: Linux, macOS, Windows, or Unix-based system
- curl Installation: Most Unix-like systems come with curl pre-installed
- Terminal Access: Command-line interface access
- Basic Command-Line Knowledge: Understanding of terminal/command prompt usage
Checking curl Installation
To verify curl is installed on your system, run:
```bash
curl --version
```
This command displays curl version information and supported protocols. If curl isn't installed, you can install it using your system's package manager:
Ubuntu/Debian:
```bash
sudo apt-get update
sudo apt-get install curl
```
CentOS/RHEL/Fedora:
```bash
sudo yum install curl
# or, on newer versions that use dnf
sudo dnf install curl
```
macOS (using Homebrew):
```bash
brew install curl
```
Windows:
- Download from the official curl website
- Use Windows Subsystem for Linux (WSL)
- Install via package managers like Chocolatey
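Because `curl --version` also reports the protocols and features compiled into the binary, a script can check for one before relying on it. The sketch below assumes the usual `Protocols:` and `Features:` lines in that output; the function names `curl_has_protocol` and `curl_has_feature` are illustrative and not part of curl.

```bash
# Illustrative helpers (not part of curl): inspect `curl --version` output.

# Succeeds if the named protocol appears on the "Protocols:" line.
curl_has_protocol() {
    curl --version | grep -i '^Protocols:' | grep -qiw -- "$1"
}

# Succeeds if the named feature appears on the "Features:" line.
curl_has_feature() {
    curl --version | grep -i '^Features:' | grep -qiw -- "$1"
}
```

For instance, `curl_has_feature HTTP2 && echo "HTTP/2 is available"` prints the message only when the installed build lists HTTP2 among its features.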
Basic curl Syntax
The fundamental curl syntax follows this pattern:
```bash
curl [options] [URL]
```
Essential Components
- curl: The command itself
- [options]: Various flags and parameters to modify behavior
- [URL]: The target URL to fetch
Simple Example
```bash
curl https://www.example.com
```
This basic command fetches the content from the specified URL and displays it in your terminal.
Simple URL Fetching
Let's start with the most basic curl operations to fetch URL content.
Fetching a Web Page
To retrieve the HTML content of a web page:
```bash
curl https://httpbin.org/html
```
This command downloads and displays the HTML content directly in your terminal. The output includes the complete HTML structure of the page.
Fetching JSON Data
When working with APIs that return JSON:
```bash
curl https://httpbin.org/json
```
This retrieves JSON data from the endpoint and displays it in your terminal. The raw JSON output can be piped to other tools for processing.
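As one example of such processing, the response can be piped to `jq` (assumed to be installed separately) to extract a single field. The helper name `fetch_json_field` and the filter shown in the comment are illustrative.

```bash
# Illustrative helper: fetch JSON from a URL and print one field with jq.
# -f makes curl exit non-zero on HTTP errors; -sS hides the progress meter
# but still prints error messages.
fetch_json_field() {
    local url=$1 filter=$2
    curl -fsS "$url" | jq -r "$filter"
}

# Example call (the filter depends on the shape of the JSON returned):
# fetch_json_field https://httpbin.org/json '.slideshow.title'
```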
Following Redirects
Many URLs redirect to other locations. Use the `-L` flag to follow redirects automatically:
```bash
curl -L https://bit.ly/2XYZ123
```
Without the `-L` flag, curl would stop at the redirect response and not fetch the final destination.
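To see where a redirect chain ends without downloading the body, `-L` can be combined with the `--write-out` variables `%{url_effective}` and `%{num_redirects}`. This is a sketch: `final_url` is a hypothetical helper name, and the cap of 10 redirects is an arbitrary choice.

```bash
# Hypothetical helper: print the final URL and the number of redirects followed.
final_url() {
    curl -sS -L --max-redirs 10 -o /dev/null \
        -w '%{url_effective} (%{num_redirects} redirects)\n' "$1"
}

# Example call:
# final_url https://bit.ly/2XYZ123
```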
Silent Mode
To suppress progress information and only show the content:
```bash
curl -s https://api.github.com/users/octocat
```
The `-s` (silent) flag eliminates the progress meter and error messages, showing only the response content.
Common curl Options
Understanding curl's extensive options is crucial for effective usage. Here are the most commonly used flags:
Output Options
Save to File (`-o` and `-O`)
```bash
# Save with custom filename
curl -o myfile.html https://www.example.com
# Save with original filename
curl -O https://www.example.com/file.pdf
```
Append to File
```bash
curl https://api.example.com/data >> accumulated_data.json
```
Verbose Output
Show Detailed Information (`-v`)
```bash
curl -v https://httpbin.org/get
```
This displays detailed information about the request and response, including headers and SSL handshake details.
Show Only Headers (`-I`)
```bash
curl -I https://www.example.com
```
The `-I` flag performs a HEAD request, returning only the HTTP headers without the body content.
Request Modification
Custom User Agent (`-A`)
```bash
curl -A "MyApp/1.0" https://httpbin.org/user-agent
```
Custom Headers (`-H`)
```bash
curl -H "Accept: application/json" -H "Authorization: Bearer token123" https://api.example.com/data
```
Request Timeout (`--connect-timeout` and `--max-time`)
```bash
curl --connect-timeout 10 --max-time 30 https://slow-api.example.com
```
Advanced Usage Examples
Working with Query Parameters
When dealing with URLs containing query parameters, proper encoding is essential:
```bash
# Simple query parameters (quote the URL so the shell does not interpret &)
curl "https://httpbin.org/get?param1=value1&param2=value2"
# Special characters percent-encoded by hand
curl "https://httpbin.org/get?search=hello%20world&category=tech"
# Let curl do the URL encoding (-G moves the data into the query string)
curl -G --data-urlencode "search=hello world" --data-urlencode "category=tech" https://httpbin.org/get
```
Handling Multiple URLs
curl can process multiple URLs in a single command:
```bash
# Fetch multiple URLs sequentially
curl https://httpbin.org/get https://httpbin.org/ip https://httpbin.org/user-agent
# Use URL globbing for numeric ranges (quoted so the shell leaves the brackets to curl)
curl "https://example.com/file[1-5].txt"
# Fetch files with different extensions (quoted so curl, not the shell, expands the braces)
curl "https://example.com/file.{jpg,png,gif}"
```
Rate Limiting and Delays
To avoid overwhelming servers, implement delays between requests:
```bash
# Allow at most one request every 2 seconds (--rate requires curl 7.84.0 or later)
curl --rate 30/m "https://api.example.com/endpoint[1-10]"
# Manual delay using sleep in scripts
for i in {1..10}; do
    curl "https://api.example.com/endpoint/$i"
    sleep 2
done
```
Working with Different HTTP Methods
curl supports all standard HTTP methods for comprehensive API interaction.
GET Requests (Default)
```bash
# Explicit GET request (-X GET is redundant, since GET is the default)
curl -X GET https://httpbin.org/get
# GET with query parameters
curl -X GET "https://httpbin.org/get?key=value"
```
POST Requests
Sending Form Data
```bash
# URL-encoded form data
curl -X POST -d "username=john&password=secret" https://httpbin.org/post
# From file
curl -X POST -d @form_data.txt https://httpbin.org/post
```
Sending JSON Data
```bash
curl -X POST \
    -H "Content-Type: application/json" \
    -d '{"name":"John","email":"john@example.com"}' \
    https://httpbin.org/post
```
Sending JSON from File
```bash
curl -X POST \
    -H "Content-Type: application/json" \
    -d @user_data.json \
    https://api.example.com/users
```
PUT Requests
```bash
# Update resource with JSON
curl -X PUT \
    -H "Content-Type: application/json" \
    -d '{"name":"Updated Name","status":"active"}' \
    https://api.example.com/users/123
```
DELETE Requests
```bash
# Simple DELETE request
curl -X DELETE https://api.example.com/users/123
# DELETE with authentication
curl -X DELETE \
    -H "Authorization: Bearer your_token_here" \
    https://api.example.com/users/123
```
PATCH Requests
```bash
# Partial update with PATCH
curl -X PATCH \
    -H "Content-Type: application/json" \
    -d '{"status":"inactive"}' \
    https://api.example.com/users/123
```
Handling Authentication
curl supports various authentication methods for accessing protected resources.
Basic Authentication
Username and Password
```bash
# Interactive password prompt
curl -u username https://httpbin.org/basic-auth/username/password
# Inline credentials (less secure: the password ends up in shell history)
curl -u username:password https://httpbin.org/basic-auth/username/password
# From netrc file (the host must match a "machine" entry in ~/.netrc)
curl -n https://api.example.com/data
```
Creating .netrc File
```bash
# ~/.netrc file content
machine api.example.com
login your_username
password your_password
```
Bearer Token Authentication
```bash
# API token in header
curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..." \
    https://api.example.com/protected
# API key as query parameter
curl "https://api.example.com/data?api_key=your_api_key_here"
```
OAuth 2.0 Authentication
```bash
# Using access token
curl -H "Authorization: Bearer ACCESS_TOKEN" \
    https://api.github.com/user
# OAuth with custom headers
curl -H "Authorization: OAuth oauth_consumer_key=key,oauth_token=token..." \
    https://api.twitter.com/1.1/statuses/home_timeline.json
```
Client Certificate Authentication
```bash
# Using client certificate
curl --cert client.pem --key client-key.pem https://secure.example.com
# PKCS#12 bundle with password (the certificate type must be given explicitly)
curl --cert-type P12 --cert client.p12:password https://secure.example.com
```
Working with Headers and Cookies
Custom Headers
Single Header
```bash
curl -H "Accept: application/json" https://api.example.com/data
```
Multiple Headers
```bash
curl -H "Accept: application/json" \
    -H "User-Agent: MyApp/1.0" \
    -H "X-API-Key: secret123" \
    https://api.example.com/data
```
Removing Default Headers
```bash
# Remove User-Agent header
curl -H "User-Agent:" https://httpbin.org/headers
```
Cookie Management
Sending Cookies
```bash
# Single cookie
curl -b "session_id=abc123" https://example.com/dashboard
# Multiple cookies
curl -b "session_id=abc123; preference=dark_mode" https://example.com/settings
# From cookie file
curl -b cookies.txt https://example.com/protected
```
Saving Cookies
```bash
# Save cookies to file
curl -c cookies.txt https://example.com/login
# Load and save cookies
curl -b cookies.txt -c cookies.txt https://example.com/dashboard
```
Cookie Jar Management
```bash
# Automatic cookie handling across a sequence of requests
curl -b cookie-jar.txt -c cookie-jar.txt https://example.com/step1
curl -b cookie-jar.txt -c cookie-jar.txt https://example.com/step2
```
Response Header Analysis
Show Response Headers
```bash
# Include headers in output
curl -i https://httpbin.org/get
# Only headers (HEAD request)
curl -I https://httpbin.org/get
# Dump headers to file
curl -D headers.txt https://example.com
```
File Downloads and Uploads
Downloading Files
Simple Download
```bash
# Download and save with original name
curl -O https://example.com/file.zip
# Download with custom name
curl -o my_file.zip https://example.com/file.zip
```
Resume Interrupted Downloads
```bash
# Resume partial download ("-C -" lets curl work out the offset from the existing file)
curl -C - -O https://example.com/large_file.zip
```
Download with Progress Bar
```bash
# Show progress bar instead of progress meter
curl -# -O https://example.com/file.zip
```
Parallel Downloads
```bash
# Run two curl processes in the background; each downloads its own files in sequence
curl -O https://example.com/file1.zip -O https://example.com/file2.zip &
curl -O https://example.com/file3.zip -O https://example.com/file4.zip &
# Block until both background jobs have finished
wait
```
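curl 7.66.0 and later can also run transfers in parallel within a single process via `--parallel` (`-Z`), which avoids managing background jobs. A minimal sketch, assuming such a version is installed; `download_all` is an illustrative name and the limit of four concurrent transfers is arbitrary.

```bash
# Illustrative helper: download every URL given as an argument, at most
# four at a time, saving each under its remote file name.
download_all() {
    curl --parallel --parallel-max 4 --remote-name-all --fail -sS "$@"
}

# Example call:
# download_all https://example.com/file1.zip https://example.com/file2.zip
```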
Uploading Files
Form File Upload
```bash
# Upload file as form data
curl -F "file=@document.pdf" https://httpbin.org/post
# Upload with additional form fields
curl -F "file=@image.jpg" -F "description=Profile photo" https://api.example.com/upload
```
Binary File Upload
```bash
# Upload raw binary data
curl -X POST --data-binary @file.zip https://api.example.com/upload
# Upload with specific content type
curl -X POST \
    -H "Content-Type: application/octet-stream" \
    --data-binary @binary_file.dat \
    https://api.example.com/binary
```
FTP Upload
```bash
# Upload to FTP server
curl -T local_file.txt ftp://username:password@ftp.example.com/remote_file.txt
# Upload multiple files
curl -T "{file1.txt,file2.txt}" ftp://username:password@ftp.example.com/
```
Troubleshooting Common Issues
Connection Problems
SSL/TLS Certificate Issues
```bash
# Skip certificate verification (not recommended for production)
curl -k https://self-signed.example.com
# Specify a CA certificate bundle file
curl --cacert ca-bundle.crt https://secure.example.com
# Use a directory of CA certificates
curl --capath /etc/ssl/certs https://secure.example.com
```
Timeout Issues
```bash
# Set connection timeout (seconds)
curl --connect-timeout 30 https://slow-server.example.com
# Set maximum total time (seconds)
curl --max-time 60 https://api.example.com/slow-endpoint
# Retry on transient failures, waiting 5 seconds between attempts
curl --retry 3 --retry-delay 5 https://unreliable-api.example.com
```
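curl's own `--retry` handles transient failures within one invocation. When a script needs control over the retry policy, the loop can be written by hand. The following is a sketch under assumptions: `fetch_with_backoff` is a hypothetical helper, and the attempt count, the 30-second limit, and the doubling delay are arbitrary defaults.

```bash
# Hypothetical helper: retry a request with exponentially growing delays.
# Usage: fetch_with_backoff URL [max_attempts] [initial_delay_seconds]
fetch_with_backoff() {
    local url=$1 max_attempts=${2:-4} delay=${3:-1} attempt=1
    while true; do
        # -f turns HTTP errors (4xx/5xx) into a non-zero exit status
        if curl -fsS --max-time 30 "$url"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts: $url" >&2
            return 1
        fi
        echo "Attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}
```

Note that this simple version retries on every failure, including 4xx responses that are unlikely to succeed on a second attempt; a stricter policy would inspect the exit status or HTTP code first.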
DNS Resolution Problems
```bash
# Use specific DNS servers (requires a curl build with c-ares)
curl --dns-servers 8.8.8.8,8.8.4.4 https://example.com
# Resolve hostname to specific IP
curl --resolve example.com:443:192.168.1.100 https://example.com
# Force IPv4 or IPv6
curl -4 https://example.com # IPv4 only
curl -6 https://example.com # IPv6 only
```
HTTP Error Handling
Handle HTTP Error Codes
```bash
# Fail on HTTP errors without printing the response body (exit code 22)
curl -f https://httpbin.org/status/404
# Stay quiet, but still show an error message when the request fails
curl -f -s -S https://httpbin.org/status/500
# Continue on HTTP errors but show status
curl -w "HTTP Status: %{http_code}\n" https://httpbin.org/status/404
```
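In scripts, curl's exit status is often more useful than its output. The sketch below maps a few well-known exit codes to messages; `explain_curl_exit` is an illustrative name, and the list is deliberately partial (the curl man page documents the full set).

```bash
# Illustrative helper: translate a curl exit status into a short message.
explain_curl_exit() {
    case $1 in
        0)  echo "success" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "failed to connect to host" ;;
        22) echo "HTTP error returned (only reported with -f/--fail)" ;;
        28) echo "operation timed out" ;;
        35) echo "TLS/SSL handshake failed" ;;
        60) echo "server certificate could not be verified" ;;
        *)  echo "curl exit code $1 (see the EXIT CODES section of the man page)" ;;
    esac
}

# Example call:
# curl -fsS https://httpbin.org/status/404; explain_curl_exit $?
```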
Response Code Checking
```bash
# Get only HTTP status code
curl -s -o /dev/null -w "%{http_code}" https://example.com
# Detailed response information
curl -w "Status: %{http_code}\nTime: %{time_total}s\nSize: %{size_download} bytes\n" \
    https://example.com
```
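The status-code idiom above is a common building block for health checks. A minimal sketch, assuming a 10-second time limit is acceptable; `check_status` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only if URL answers with the expected status.
# Usage: check_status URL [expected_status]
check_status() {
    local url=$1 expected=${2:-200} actual
    # When no HTTP response is received, curl reports 000 for %{http_code}
    actual=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
    if [ "$actual" = "$expected" ]; then
        echo "OK $url ($actual)"
    else
        echo "FAIL $url (expected $expected, got $actual)" >&2
        return 1
    fi
}

# Example call:
# check_status https://example.com 200 || echo "site is down"
```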
Debugging and Logging
Verbose Output for Debugging
```bash
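# Verbose output: request and response headers plus connection details
curl -v https://httpbin.org/get
# Full trace of all data as readable ASCII
curl --trace-ascii trace.log https://httpbin.org/get
# Full trace including a hex dump of all data
curl --trace trace.bin https://httpbin.org/get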
```
Network Interface Issues
```bash
# Use specific network interface
curl --interface eth0 https://example.com
# Use a local port from a specific range
curl --local-port 8080-8090 https://example.com
```
Best Practices
Security Considerations
Protect Sensitive Data
```bash
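# Use environment variables for sensitive data
export API_TOKEN="your_secret_token"
curl -H "Authorization: Bearer $API_TOKEN" https://api.example.com
# Use configuration files with proper permissions:
# create the file and restrict it before writing the secret.
# Note that options in ~/.curlrc apply to every request curl makes.
touch ~/.curlrc
chmod 600 ~/.curlrc
echo 'header = "Authorization: Bearer secret_token"' >> ~/.curlrc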
```
Validate SSL Certificates
```bash
# Certificates are verified by default; --cert-status additionally requires
# a valid stapled OCSP response from the server
curl --cert-status https://secure-api.example.com
# Pin the server's public key for critical services
curl --pinnedpubkey sha256//base64encodedkey https://critical-api.example.com
```
Performance Optimization
Connection Reuse
```bash
# Enable HTTP/2 when available
curl --http2 https://http2-enabled-site.example.com
# Send TCP keepalive probes after 60 seconds of idle time
curl --keepalive-time 60 https://api.example.com/endpoint1
```
Compression
```bash
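# Request a compressed response and decompress it automatically
curl --compressed https://api.example.com/large-response
# Specify accepted encodings by hand (curl does not decompress the body in this case)
curl -H "Accept-Encoding: gzip, deflate, br" https://example.com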
```
Bandwidth Management
```bash
# Limit download speed
curl --limit-rate 100k https://example.com/large-file.zip
# Limit upload speed
curl --limit-rate 50k -T large-upload.zip ftp://ftp.example.com/
```
Scripting Best Practices
Error Handling in Scripts
```bash
#!/bin/bash
# Append the status code after a marker so body and status can be split apart
response=$(curl -s -w "HTTPSTATUS:%{http_code}" https://api.example.com/data)
# Everything after the last marker is the status code; everything before it is the body
http_code="${response##*HTTPSTATUS:}"
body="${response%HTTPSTATUS:*}"
if [ "$http_code" -eq 200 ]; then
    echo "Success: $body"
else
    echo "HTTP Error: $http_code" >&2
    exit 1
fi
```
Configuration Management
```bash
# Use .curlrc for default options
echo 'user-agent = "MyScript/1.0"' >> ~/.curlrc
echo 'connect-timeout = 30' >> ~/.curlrc
echo 'max-time = 120' >> ~/.curlrc
```
Logging and Monitoring
```bash
# Logging function: appends request metadata to a dated log file
# and saves the response body to response.json
log_curl_request() {
    local url=$1
    local logfile="curl_$(date +%Y%m%d).log"
    curl -w "URL: %{url_effective}\nHTTP: %{http_code}\nTime: %{time_total}s\nSize: %{size_download}\n" \
        -s -o response.json "$url" >> "$logfile"
}
```
Performance Optimization
Advanced Connection Management
Connection Pooling
```bash
# Multiple URLs in one invocation reuse the same connection when possible
curl --keepalive-time 30 \
    https://api.example.com/endpoint1 \
    https://api.example.com/endpoint2 \
    https://api.example.com/endpoint3
```
HTTP/2 and HTTP/3
```bash
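# Speak HTTP/2 immediately, skipping the HTTP/1.1 Upgrade step
# (applies to cleartext http:// URLs; HTTPS negotiates HTTP/2 during the TLS handshake)
curl --http2-prior-knowledge http://http2.example.com
# Try HTTP/3 if available (requires a curl build with HTTP/3 support)
curl --http3 https://http3-enabled.example.com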
```
Caching Strategies
Conditional Requests
```bash
# Use If-Modified-Since header
curl -H "If-Modified-Since: Sat, 21 Oct 2023 07:28:00 GMT" \
    https://api.example.com/data
# Use ETags for cache validation
curl -H "If-None-Match: \"etag-value-here\"" \
    https://api.example.com/resource
```
Monitoring and Metrics
Performance Metrics
```bash
# Timing breakdown for each phase of the request
curl -w "DNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nTransfer: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
    https://example.com
# Size and speed metrics
curl -w "Downloaded: %{size_download} bytes\nSpeed: %{speed_download} bytes/sec\n" \
    https://example.com/large-file
```
Conclusion
curl is an incredibly versatile and powerful tool for fetching URLs and interacting with web services. Throughout this comprehensive guide, we've explored everything from basic URL fetching to advanced authentication methods, file transfers, and performance optimization techniques.
Key Takeaways
1. Versatility: curl supports numerous protocols and authentication methods, making it suitable for virtually any web-related task.
2. Flexibility: The extensive range of options allows you to customize requests precisely to your needs, from simple GET requests to complex API interactions.
3. Reliability: With proper error handling, timeout configuration, and retry mechanisms, curl can be used reliably in production environments.
4. Security: When configured correctly with proper SSL verification and credential management, curl provides secure communication with web services.
5. Performance: Advanced features like HTTP/2 support, connection reuse, and compression help optimize performance for high-volume operations.
Next Steps
Now that you have a comprehensive understanding of curl, consider these next steps:
- Practice: Experiment with different APIs and services to reinforce your learning
- Automation: Integrate curl into shell scripts and automation workflows
- Monitoring: Use curl for health checks and monitoring in your infrastructure
- API Testing: Leverage curl for testing and debugging API endpoints
- Advanced Topics: Explore libcurl for programmatic integration in applications
Additional Resources
- Official Documentation: Visit the curl website for the most up-to-date documentation
- Community: Join curl mailing lists and forums for support and advanced discussions
- Integration: Explore how curl integrates with other tools like jq for JSON processing
- Alternatives: Consider complementary tools like wget, HTTPie, and Postman for different use cases
Remember that mastering curl is an ongoing process. As web technologies evolve, new features and best practices emerge. Stay updated with the latest curl releases and continue practicing with real-world scenarios to maintain and improve your skills.
Whether you're downloading files, testing APIs, or automating web interactions, curl remains an essential tool in any developer's or system administrator's toolkit. With the knowledge gained from this guide, you're well-equipped to handle virtually any URL fetching task that comes your way.