Python Tips

Posted on Wednesday, Jul 01, 2020 in programming • Tagged with python, tips, lists, functools, itertools, timezone

A collection of small Python scripts and tips.

This post is based on a Twitter thread I started in April 2020 and collects all the tips in one place, in a format that is easier to read than Twitter's 280 characters.

Both will be updated frequently.

List Flatten without explicit loops

  • Using itertools' chain
import itertools

test = [[-1, -2], [30, 40], [25, 35]]
list(itertools.chain.from_iterable(test))

>>  [-1, -2, 30, 40, 25, 35]
test =  [[-1, -2], [30, 40], [25, 35]]
list(map(int, ''.join(c for c in str(test) if c not in '[]').split(',')))

>> [-1, -2, 30, 40, 25, 35]

Not pretty in my opinion ;)
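Another loop-free option, offered here as a sketch rather than something from the original thread, is to fold the sublists together with functools.reduce and operator.iconcat. The helper name flatten_once is made up for illustration, and like chain.from_iterable it only flattens one level of nesting.

```python
import functools
import operator


def flatten_once(nested):
    """Flatten one level of nesting without an explicit loop.

    flatten_once is a hypothetical helper name, not part of any library.
    """
    # iconcat(a, b) performs a += b and returns a, so every sublist is
    # appended in place to the fresh accumulator list passed as the
    # initial value. The input sublists themselves are left untouched.
    return functools.reduce(operator.iconcat, nested, [])


test = [[-1, -2], [30, 40], [25, 35]]
print(flatten_once(test))
# prints [-1, -2, 30, 40, 25, 35]
```

Passing a new empty list as the initial value matters: without it, reduce would use the first sublist as the accumulator and modify it in place.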

Count individual items of any iterable

from collections import Counter
count = Counter(['a', 'b', 'c', 'a', 'a', 'b', 'd'])

print(count)
>> Counter({'a': 3, 'b': 2, 'c': 1, 'd': 1})

count['a']
>> 3
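A few more Counter features tend to be useful once the counts exist. The sketch below reuses the same sample list and is only meant as an illustration of most_common, update, and the behaviour for items that were never seen.

```python
from collections import Counter

count = Counter(['a', 'b', 'c', 'a', 'a', 'b', 'd'])

# most_common(n) returns the n most frequent items as (item, count) pairs.
top_two = count.most_common(2)
print(top_two)
# prints [('a', 3), ('b', 2)]

# update() adds the counts from another iterable to the existing tallies.
count.update(['d', 'd'])
print(count['d'])
# prints 3

# Looking up an item that was never counted returns 0 instead of
# raising KeyError.
print(count['z'])
# prints 0
```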

Repeat a series of values from any iterable

import itertools as it

data = it.cycle([1, 2])

for i in range(10):
    print(next(data))

>> 1
2
1
2
1
2
1
2
1
2
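If the range()/next() bookkeeping feels like too much, itertools.islice can cut a fixed number of items out of the endless cycle, and zip stops on its own when the shorter iterable runs out. This is a small sketch of both ideas; the rows list is made-up sample data.

```python
import itertools as it

# islice stops the otherwise endless cycle after ten items.
first_ten = list(it.islice(it.cycle([1, 2]), 10))
print(first_ten)
# prints [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

# zip ends with the shorter iterable, so the cycle is consumed only
# as far as needed to label every row.
rows = ['a', 'b', 'c']  # hypothetical sample data
labelled = list(zip(rows, it.cycle(['odd', 'even'])))
print(labelled)
# prints [('a', 'odd'), ('b', 'even'), ('c', 'odd')]
```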

Name slices to reuse them

# slice(start, end, step)

STEPTWO = slice(None, None, 2)
integer_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

integer_list[STEPTWO]

>> [0, 2, 4, 6, 8]

This is the same as:

integer_list[::2]
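Named slices pay off most when the positions carry meaning, for example when picking fields out of fixed-width text. The record layout below is invented for the example; the point is only that DATE, LEVEL and MESSAGE read better than bare index pairs.

```python
# Hypothetical fixed-width record: 10 characters of date, a space,
# a 5-character level, another space, then the message.
DATE = slice(0, 10)
LEVEL = slice(11, 16)
MESSAGE = slice(17, None)

record = "2020-07-01 ERROR disk almost full"

print(record[DATE])
# prints 2020-07-01
print(record[LEVEL])
# prints ERROR
print(record[MESSAGE])
# prints disk almost full
```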

Reverse any "indexable" collection that supports slices

# slice(None, None, -1) or [::-1]

integer_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
integer_list[::-1]

>> [9, 8, 7, 6, 5 …

Continue reading

[Debian] Changing Docker's default /var/lib/docker directory

Posted on Tuesday, Sep 10, 2019 in programming • Tagged with debian, docker, directory, systemd

By default, Docker stores its files in the /var/lib/docker directory. If, like me, you're running out of disk space on the default file system, you can quickly change that directory. These are the steps I followed on Debian 10; they should work with any distro that uses systemd.

As superuser, edit Docker's systemd unit file, /lib/systemd/system/docker.service, and replace the ExecStart command:

$ sudo vim /lib/systemd/system/docker.service

Change the line:

ExecStart=/usr/bin/docker daemon -H fd://

to

ExecStart=/usr/bin/docker daemon -g /new/path -H fd://

Then stop the service and make sure the daemon has stopped completely:

$ sudo systemctl stop docker
$ ps -aux | grep -i docker

The only output should be the one from grep itself. Then you can reload the systemd configuration and, optionally, synchronize your current Docker data to the new directory:

$ sudo systemctl daemon-reload
$ sudo mkdir /new/path
$ sudo rsync -aqxP /var/lib/docker/ /new/path

Start the service again (sudo systemctl start docker) and your Docker installation should be running with the new data directory:

$ ps -aux | grep -i docker

That's it.
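If you have to repeat the ExecStart change on several machines, the edit can be scripted. The following is only a sketch in Python: add_data_dir is a hypothetical helper, and it assumes the unit file contains exactly the "docker daemon" form of the ExecStart line shown in this post, which differs between Docker versions.

```python
def add_data_dir(line, new_path):
    """Return an ExecStart line with '-g new_path' added after 'docker daemon'.

    add_data_dir is a hypothetical helper. Lines that are not the expected
    ExecStart entry, or that already carry a -g flag, are returned unchanged.
    """
    prefix = "ExecStart=/usr/bin/docker daemon"
    if not line.startswith(prefix) or " -g " in line:
        return line
    # Everything after the prefix (for example ' -H fd://') is kept as is.
    rest = line[len(prefix):]
    return "{0} -g {1}{2}".format(prefix, new_path, rest)


original = "ExecStart=/usr/bin/docker daemon -H fd://"
print(add_data_dir(original, "/new/path"))
# prints ExecStart=/usr/bin/docker daemon -g /new/path -H fd://
```

The function only transforms a single line of text; reading the unit file, writing it back, and running systemctl daemon-reload afterwards are left to the caller.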


Visualizing Tweets During Election Weekend

Posted on Thursday, Dec 10, 2015 in programming • Tagged with data, tweets, twitter, elections, visualize, folium, python, matplotlib, pylab

On Sunday (Dec. 06) parliamentary elections were held in Venezuela. I took the opportunity to fetch some data from Twitter's public stream and use graphics to show what was happening on "social media" during that day. To achieve this I relied on the Python programming language.

The Data

I decided to monitor all geolocated tweets originating in my home city (Caracas) over a period of five days: two days before the election, election day itself, and two days after it.

To fetch the data I used the Twython library. Since I've worked with it before, it should be easy to set up a quick script that fetches the data and saves it to a text file.

Once the API and OAuth keys are set up correctly following Twython's documentation, the next step is to fetch the data originating in a certain location by passing it as a parameter to the streaming API. Twitter uses a set of bounding boxes to track the location of a tweet, and only geolocated tweets falling within the requested bounding boxes will be included.

import json, os
from twython import TwythonStreamer

caracas_location = "-67.017225,10.419554,-66.778716,10.52006"

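class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        # Keep only tweets that carry exact coordinates; .get() also skips
        # stream messages that have no 'coordinates' key at all.
        if data.get('coordinates') is not None:
            # Append one JSON document per line.
            with open('tweets.txt', 'a') as f:
                f.write(json.dumps(data))
                f.write('\n')

    def on_error(self, status_code, data):
        # Log each error on its own line.
        with open('errors.txt', 'a') as f:
            f.write('error: {0}: {1}\n'.format(status_code, data))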



stream = MyStreamer(os.environ['APP_KEY'], os.environ['APP_SECRET'],
                        os.environ['OAUTH_TOKEN'], os.environ['OAUTH_TOKEN_SECRET'])

stream.statuses.filter(locations=caracas_location)
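To make the bounding-box format concrete, here is a small sketch that is not part of the original script. It assumes the locations string lists the south-west corner first and then the north-east corner, each as longitude,latitude, which is how I understand Twitter's streaming documentation. parse_bounding_box and in_bounding_box are names invented for the example, and the two test points are illustrative only.

```python
def parse_bounding_box(locations):
    """Split a 'west,south,east,north' string into four floats."""
    west, south, east, north = (float(value) for value in locations.split(","))
    return west, south, east, north


def in_bounding_box(longitude, latitude, box):
    """Return True when the point lies inside the box (edges included)."""
    west, south, east, north = box
    return west <= longitude <= east and south <= latitude <= north


caracas_location = "-67.017225,10.419554,-66.778716,10.52006"
caracas_box = parse_bounding_box(caracas_location)

# Illustrative points only: one inside the box, one well to the west of it.
print(in_bounding_box(-66.9, 10.5, caracas_box))
# prints True
print(in_bounding_box(-70.0, 10.5, caracas_box))
# prints False
```

A check like this can be handy when re-reading the saved tweets, for instance to confirm that each stored point really falls inside the area requested from the API.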

Since I just wanted to fetch geolocated tweets, I checked for the existence of the coordinates entity in the …


Continue reading

Quick tip: Adding Variables to env in a Virtualenv (for development purpose)

Posted on Thursday, Jul 16, 2015 in programming • Tagged with python, enviroment, virtualenv, export, bash

If you're working with third-party APIs, you might find code like YOUR_SECRET_KEY="some secret api key" in your source code. This is bad practice for a lot of reasons (security, source code sharing, etc.). Instead, the recommended way to handle this kind of situation is to store the value in an environment variable and read it in your code with something like this:

import os
os.environ.get('YOUR_SECRET_KEY')
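Note that os.environ.get returns None when the variable is missing, which can surface much later as a confusing error. One possible way to fail early is a tiny wrapper like the one below; require_env is a hypothetical helper written for this example, not something provided by the os module.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail loudly.

    require_env is a hypothetical helper, not part of the os module.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError("Missing environment variable: {0}".format(name))
    return value


# Set here only so the example runs on its own; normally the shell exports it.
os.environ["YOUR_SECRET_KEY"] = "some secret api key"
print(require_env("YOUR_SECRET_KEY"))
# prints some secret api key
```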

So, how do you avoid adding the variable to the environment every time you sit down to code? If you're working with virtualenv, you can simply add it to the env/bin/activate script:

YOUR_SECRET_KEY="some secret api key"
export YOUR_SECRET_KEY

Debian (development tips)

Posted on Saturday, Dec 13, 2014 in programming • Tagged with debian, help, documentation, libraries, pil, libjpeg, locale, matplotlib, libxml, ubuntu, freetype, postgresql, database, scipy

These are some extra steps that I've found necessary when starting development on a freshly installed Debian machine.

JPEG support in PIL and Pillow.

$ sudo apt-get install libjpeg libjpeg-dev libfreetype6 libfreetype6-dev zlib1g-dev
$ sudo ln -s /usr/lib/`uname -i`-linux-gnu/libfreetype.so /usr/lib/
$ sudo ln -s /usr/lib/`uname -i`-linux-gnu/libjpeg.so /usr/lib/
$ sudo ln -s /usr/lib/`uname -i`-linux-gnu/libz.so /usr/lib/

Installing lxml in Python (Debian based).

If you're getting the "fatal error: libxml/xmlversion.h: No such file or directory" error, just install the following development files:

$ sudo apt-get install python-dev libxml2-dev libxslt1-dev

Problems with 'matplotlib' and freetype.

If you're having problems installing matplotlib in a Python virtualenv and are getting the 'freetype missing' error, you should install the development files for freetype and (in most cases) the build dependencies for matplotlib.

$ sudo apt-get -u install libfreetype6-dev
$ sudo apt-get build-dep python-matplotlib

After that, you can use pip as usual to install matplotlib:

$ pip install matplotlib

PS. I know of cases where you have to update python-virtualenv and python-pip after you use build-dep. Just apt-get upgrade your installation. Today I learned about the 'pydoc' command from Python.

Problems with locale

On some machines I've run into problems when setting locales from Python. First check the output of running:

$ locale

In my case the output is:

LANG=es_VE.UTF-8
LANGUAGE=es_VE:es
LC_CTYPE="es_VE.UTF-8"
LC_NUMERIC="es_VE.UTF-8"
LC_TIME="es_VE.UTF-8"
LC_COLLATE="es_VE.UTF-8"
LC_MONETARY="es_VE.UTF-8"
LC_MESSAGES="es_VE.UTF-8"
LC_PAPER="es_VE.UTF-8"
LC_NAME="es_VE.UTF-8"
LC_ADDRESS="es_VE.UTF-8"
LC_TELEPHONE="es_VE.UTF-8"
LC_MEASUREMENT="es_VE.UTF-8"
LC_IDENTIFICATION="es_VE.UTF-8"
LC_ALL=

Then use:

import locale

try:
    locale.setlocale(locale …

Continue reading