Python and HTTP clients

Response Content

We can read the content of the server’s response. Consider the GitHub timeline
again:

>>> import requests

>>> r = requests.get('https://api.github.com/events')
>>> r.text
'[{"repository":{"open_issues":0,"url":"https://github.com/...

Requests will automatically decode content from the server. Most unicode
charsets are seamlessly decoded.

When you make a request, Requests makes educated guesses about the encoding of
the response based on the HTTP headers. The text encoding guessed by Requests
is used when you access r.text. You can find out what encoding Requests is
using, and change it, using the r.encoding property:

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

If you change the encoding, Requests will use the new value of r.encoding
whenever you call r.text. You might want to do this in any situation where
you can apply special logic to work out what the encoding of the content will
be. For example, HTML and XML have the ability to specify their encoding in
their body. In situations like this, you should use r.content to find the
encoding, and then set r.encoding. This will let you use r.text with
the correct encoding.
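
The body-inspection step can be sketched roughly as follows. This is only an illustration: sniff_declared_encoding is a hypothetical helper (it is not part of Requests), and its regular expressions cover just the common XML declaration and HTML meta charset forms.

```python
import re

# Hypothetical helper, not part of Requests: look for an encoding declared
# inside the first bytes of an XML or HTML body.
_XML_DECL = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')
_META_CHARSET = re.compile(rb'<meta[^>]*charset=["\']?([A-Za-z0-9._-]+)', re.I)


def sniff_declared_encoding(body, default=None):
    """Return the encoding named in the body, or `default` if none is found."""
    head = body[:1024]  # declarations are expected near the top of the body
    for pattern in (_XML_DECL, _META_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii")
    return default


# Intended use with a Requests response `r` (needs network access):
#     r.encoding = sniff_declared_encoding(r.content, default=r.encoding)
#     text = r.text
```

The helper works on r.content (raw bytes) on purpose, because r.text would already have been decoded with the guessed encoding.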

Python HTTP Client

In this post on Python’s HTTP module, we will make connections and send HTTP requests such as GET, POST and PUT. Let’s get started.

Making HTTP Connections

We will start with the simplest thing the HTTP module can do: making an HTTP connection. Here is a sample program:
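
A minimal sketch of such a program, assuming the standard library’s http.client module and a placeholder host (www.example.com stands in for whatever site you want to reach):

```python
import http.client

# Assumption: "www.example.com" is a placeholder host chosen for illustration.
connection = http.client.HTTPConnection("www.example.com", 80, timeout=10)

# Creating the object does not open a socket yet; the connection is
# established when the first request is sent (or connect() is called).
print(connection.host, connection.port, connection.timeout)
```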

In this script, we set up a connection to the host on port 80 with a specific timeout.

Python HTTP GET

Now, we will use the HTTP client to get a response and a status from a URL. Let’s look at a code snippet:
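
A minimal sketch of a GET request, again assuming http.client. The get_status function and its parameters are names invented for this illustration:

```python
import http.client


def get_status(host, path="/", port=None, use_tls=True, timeout=10):
    """Send a GET request and return (status, reason)."""
    # HTTPSConnection for sites served over HTTPS, HTTPConnection otherwise.
    conn_class = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    connection = conn_class(host, port, timeout=timeout)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.reason
    finally:
        # Always close the connection once we are done with it.
        connection.close()


# Example call (needs network access; the host is a placeholder):
#     print(get_status("www.example.com"))
```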

In the above script, we used a URL and checked the status with the connection object. Remember to close a connection once you’re done with the connection object. Also, notice that we used an HTTPS connection rather than a plain HTTP one, as the website is served over HTTPS.

Getting SSL: CERTIFICATE_VERIFY_FAILED Error?

When I first executed the above program, I got an SSL: CERTIFICATE_VERIFY_FAILED error.

From the output, it was clear that it had something to do with the SSL certificates. But the website’s certificate was fine, so the problem had to be in my setup. After some googling, I found that on macOS we need to run a certificate installation file present in the Python installation directory to fix this issue. Running it appears to install the latest certificates to be used when making SSL connections.

Note that I got this error on macOS. However, on my Ubuntu system, it worked perfectly fine.

Getting the Header list from Response

The headers of the response we receive usually contain important information about the type of data sent back from the server, along with the response status. We can get a list of headers from the response object itself. Let’s look at a code snippet which is a slightly modified version of the last program:
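
A minimal sketch of reading the headers, assuming http.client. The get_headers function is a name made up for this example:

```python
import http.client


def get_headers(host, path="/", port=None, use_tls=True, timeout=10):
    """Send a GET request and return the response headers as (name, value) pairs."""
    conn_class = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    connection = conn_class(host, port, timeout=timeout)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        # getheaders() returns a list of (header, value) tuples.
        return response.getheaders()
    finally:
        connection.close()


# Example call (needs network access; the host is a placeholder):
#     for name, value in get_headers("www.example.com"):
#         print(name, value)
```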

Python HTTP POST

We can POST data to a URL as well with the HTTP module and get a response back. Here is a sample program:
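
A minimal sketch of a POST request, assuming http.client and a JSON payload. The post_json function is a hypothetical name, and httpbin.org is used only as an example echo service:

```python
import http.client
import json


def post_json(host, path, payload, port=None, use_tls=True, timeout=10):
    """POST `payload` as JSON and return (status, body_text)."""
    conn_class = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    connection = conn_class(host, port, timeout=timeout)
    body = json.dumps(payload)  # ASCII-only by default, safe to send as str
    headers = {"Content-Type": "application/json"}
    try:
        connection.request("POST", path, body=body, headers=headers)
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        connection.close()


# Example call (needs network access):
#     status, text = post_json("httpbin.org", "/post", {"text": "Hello HTTP"})
```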

Feel free to use the httpbin service to try more requests.

Python HTTP PUT Request

Of course, we can also perform a PUT request using the HTTP module itself. We will reuse the last program. Let’s look at a code snippet:
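
A minimal sketch, assuming http.client. The send_json function is a hypothetical generalisation that takes the HTTP method as an argument, so a PUT differs from a POST only in the method name:

```python
import http.client
import json


def send_json(method, host, path, payload, port=None, use_tls=True, timeout=10):
    """Send `payload` as JSON with the given HTTP method; return (status, body_text)."""
    conn_class = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    connection = conn_class(host, port, timeout=timeout)
    try:
        connection.request(
            method,
            path,
            body=json.dumps(payload),
            headers={"Content-Type": "application/json"},
        )
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        connection.close()


# Example call (needs network access; httpbin.org is only an example service):
#     status, text = send_json("PUT", "httpbin.org", "/put", {"text": "Hello HTTP"})
```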

Timeouts

Most requests to external servers should have a timeout attached, in case the
server is not responding in a timely manner. By default, requests do not time
out unless a timeout value is set explicitly. Without a timeout, your code may
hang for minutes or more.

The connect timeout is the number of seconds Requests will wait for your
client to establish a connection to a remote machine (corresponding to the
connect() call on the socket). It’s a good practice to set connect timeouts
to slightly larger than a multiple of 3, which is the default TCP packet
retransmission window.

Once your client has connected to the server and sent the HTTP request, the
read timeout is the number of seconds the client will wait for the server
to send a response. (Specifically, it’s the number of seconds that the client
will wait between bytes sent from the server. In 99.9% of cases, this is the
time before the server sends the first byte).

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for
a response, by passing None as a timeout value and then retrieving a cup of
coffee.

r = requests.get('https://github.com', timeout=None)
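
When a timeout expires, Requests raises an exception, so callers usually want to catch it. A minimal sketch, where fetch_or_none is a hypothetical wrapper and the default values simply mirror the tuple used above:

```python
import requests


def fetch_or_none(url, connect_timeout=3.05, read_timeout=27):
    """Return the response, or None if either timeout expires."""
    try:
        return requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        # Timeout is the common base class of ConnectTimeout and ReadTimeout.
        return None


# Example call (needs network access):
#     r = fetch_or_none('https://github.com')
```

Returning None keeps the sketch short. Real code may prefer to log the failure or retry.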

Body Content Workflow

By default, when you make a request, the body of the response is downloaded
immediately. You can override this behaviour and defer downloading the response
body until you access the Response.content
attribute with the stream parameter:

tarball_url = 'https://github.com/psf/requests/tarball/master'
r = requests.get(tarball_url, stream=True)

At this point only the response headers have been downloaded and the connection
remains open, hence allowing us to make content retrieval conditional:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...

You can further control the workflow by use of the Response.iter_content()
and Response.iter_lines() methods.
Alternatively, you can read the undecoded body from the underlying
urllib3 HTTPResponse object at
Response.raw.

If you set stream to True when making a request, Requests cannot
release the connection back to the pool unless you consume all the data or call
Response.close. This can lead to
inefficiency with connections. If you find yourself partially reading response
bodies (or not reading them at all) while using stream=True, you should
make the request within a with statement to ensure it’s always closed:
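
A minimal sketch of that pattern. The read_prefix function, its max_bytes limit and the chunk size are assumptions chosen for illustration:

```python
import requests


def read_prefix(url, max_bytes=1024, timeout=10):
    """Return at most `max_bytes` of the body, then release the connection."""
    # The with statement closes the response even if we stop reading early.
    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()
        prefix = b""
        for chunk in r.iter_content(chunk_size=256):
            prefix += chunk
            if len(prefix) >= max_bytes:
                break
        return prefix[:max_bytes]


# Example call (needs network access):
#     head = read_prefix('https://httpbin.org/get', max_bytes=64)
```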

Custom Authentication

Requests allows you to specify your own authentication mechanism.

Any callable which is passed as the auth argument to a request method will
have the opportunity to modify the request before it is dispatched.

Authentication implementations are subclasses of AuthBase,
and are easy to define. Requests provides two common authentication scheme
implementations in requests.auth: HTTPBasicAuth and
HTTPDigestAuth.

Let’s pretend that we have a web service that will only respond if the
X-Pizza header is set to a password value. Unlikely, but just go with it.

from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""
    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r

Then, we can make a request using our Pizza Auth:
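
A minimal sketch of that call. The URL and username are placeholders, and the request is only prepared rather than sent, so the header that PizzaAuth attaches can be inspected without network access. Passing auth=PizzaAuth(...) to requests.get() applies the same hook.

```python
import requests
from requests.auth import AuthBase


class PizzaAuth(AuthBase):
    """Same class as above, repeated so the snippet runs on its own."""

    def __init__(self, username):
        self.username = username

    def __call__(self, r):
        r.headers['X-Pizza'] = self.username
        return r


# Placeholder URL and username. Sending the request for real would be:
#     requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
request = requests.Request('GET', 'http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
prepared = request.prepare()  # runs the auth callable on the prepared request
print(prepared.headers['X-Pizza'])
```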

Raw Response Content

In the rare case that you’d like to get the raw socket response from the
server, you can access r.raw. If you want to do this, make sure you set
stream=True in your initial request. Once you do, you can do this:

>>> r = requests.get('https://api.github.com/events', stream=True)

>>> r.raw
<urllib3.response.HTTPResponse object at 0x101194810>

>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'

In general, however, you should use a pattern like this to save what is being
streamed to a file:

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size=128):
        fd.write(chunk)

Using Response.iter_content will handle a lot of what you would otherwise
have to handle when using Response.raw directly. When streaming a
download, the above is the preferred and recommended way to retrieve the
content. Note that chunk_size can be freely adjusted to a number that
may better fit your use cases.

SSL Cert Verification

Requests verifies SSL certificates for HTTPS requests, just like a web browser.
By default, SSL verification is enabled, and Requests will throw an SSLError if
it’s unable to verify the certificate:

>>> requests.get('https://requestb.in')
requests.exceptions.SSLError: hostname 'requestb.in' doesn't match either of '*.herokuapp.com', 'herokuapp.com'

I don’t have SSL set up on this domain, so it throws an exception. Excellent. GitHub does though:

>>> requests.get('https://github.com')
<Response [200]>

You can pass the path to a CA_BUNDLE file or directory with certificates of trusted CAs:

>>> requests.get('https://github.com', verify='/path/to/certfile')

or persistent:

s = requests.Session()
s.verify = '/path/to/certfile'

Note

If verify is set to a path to a directory, the directory must have been processed using
the c_rehash utility supplied with OpenSSL.

This list of trusted CAs can also be specified through the REQUESTS_CA_BUNDLE environment variable.
If REQUESTS_CA_BUNDLE is not set, CURL_CA_BUNDLE will be used as fallback.

Requests can also skip verification of the SSL certificate if you set verify to False:

>>> requests.get('https://kennethreitz.org', verify=False)
<Response [200]>

Note that when verify is set to False, Requests will accept any TLS
certificate presented by the server, and will ignore hostname mismatches
and/or expired certificates, which will make your application vulnerable to
man-in-the-middle (MitM) attacks. Setting verify to False may be useful
during local development or testing.

Parameter Values

Parameter        Description
url              Required. The URL of the request.
params           Optional. A dictionary, list of tuples or bytes to send as a query string. Default: None
allow_redirects  Optional. A Boolean to enable/disable redirection. Default: True (allowing redirects)
auth             Optional. A tuple to enable a certain HTTP authentication. Default: None
cert             Optional. A string or tuple specifying a cert file or key. Default: None
cookies          Optional. A dictionary of cookies to send to the specified URL. Default: None
headers          Optional. A dictionary of HTTP headers to send to the specified URL. Default: None
proxies          Optional. A dictionary mapping protocol to proxy URL. Default: None
stream           Optional. A Boolean indicating whether the response should be immediately downloaded (False) or streamed (True). Default: False
timeout          Optional. A number, or a tuple, indicating how many seconds to wait for the client to make a connection and/or for the server to send a response. Default: None, which means the request will continue until the connection is closed
verify           Optional. A Boolean or a string indicating whether to verify the server’s TLS certificate. Default: True
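
To see how a few of these parameters combine, here is a minimal sketch. The URL, query parameter and header value are placeholders, and the request is prepared without being sent so that it runs without network access:

```python
import requests

# Placeholder values chosen for illustration.
request = requests.Request(
    'GET',
    'https://api.github.com/events',
    params={'per_page': 5},
    headers={'Accept': 'application/json'},
)
prepared = request.prepare()
print(prepared.url)  # the query string built from `params` is appended to the URL

# Sending it with some of the remaining parameters (needs network access):
#     with requests.Session() as s:
#         r = s.send(prepared, timeout=(3.05, 27), allow_redirects=True,
#                    stream=False, verify=True)
```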