Ralsina.Me — Roberto Alsina's website

Six Years

casamiento

And she's still here.

Scraping doesn't hurt

I am generally allergic to HTML, especially when it comes to parsing it. However, every now and then something comes up, and it's fun to keep the muscles stretched.

So, consider the TED Talks site. They have a really nice table with information about their talks, just in case you want to do something with them.

But how do you get that information? By scraping it. And what's an easy way to do it? By using Python and BeautifulSoup:

# Written for Python 3 with BeautifulSoup 4 (the bs4 package)
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Read the whole page.
data = urlopen('http://www.ted.com/talks/quick-list').read()
# Parse it with the HTML parser that ships with Python
soup = BeautifulSoup(data, 'html.parser')

# Find the table with the data
table = soup.find_all('table', attrs={"class": "downloads notranslate"})[0]
# Get the rows, skip the first one (the header)
rows = table.find_all('tr')[1:]

items = []
# For each row, get the data
# And store it somewhere
for row in rows:
    cells = row.find_all('td')
    item = {}
    item['date'] = cells[0].text
    item['event'] = cells[1].text
    item['title'] = cells[2].text
    item['duration'] = cells[3].text
    # The last cell holds the download links
    item['links'] = [a['href'] for a in cells[4].find_all('a')]
    items.append(item)

And that's it! Surprisingly pain-free!
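If BeautifulSoup is not at hand, the same row-by-row idea can be sketched with only the standard library. This is a rough sketch, not a replacement for the code above: the `SAMPLE` markup, the example.com URLs and the `TalkTableParser` class are all made up for illustration, the real page's markup may differ, and the parser does not check the table's class, so it assumes the page holds just the one table.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the real page, so the sketch runs offline.
SAMPLE = """
<table class="downloads notranslate">
  <tr><th>Published</th><th>Event</th><th>Talk</th><th>Duration</th><th>Download</th></tr>
  <tr>
    <td>Jun 2010</td><td>TED2010</td><td>An example talk</td><td>18:02</td>
    <td><a href="http://example.com/low.mp4">Low</a> <a href="http://example.com/high.mp4">High</a></td>
  </tr>
</table>
"""


class TalkTableParser(HTMLParser):
    """Collect one dict per table row that contains <td> cells."""

    FIELDS = ('date', 'event', 'title', 'duration')

    def __init__(self):
        super().__init__()
        self.items = []
        self._cells = None    # cell texts of the current row, None outside a row
        self._links = []      # hrefs found in the fifth cell of the current row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._cells, self._links = [], []
        elif tag == 'td' and self._cells is not None:
            self._cells.append('')
            self._in_cell = True
        elif tag == 'a' and self._in_cell and len(self._cells) == 5:
            # Only the fifth cell is treated as the list of download links.
            href = dict(attrs).get('href')
            if href:
                self._links.append(href)

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_cell = False
        elif tag == 'tr' and self._cells:
            # Header rows use <th>, so they leave _cells empty and are skipped.
            item = dict(zip(self.FIELDS, (c.strip() for c in self._cells)))
            item['links'] = self._links
            self.items.append(item)
            self._cells = None


parser = TalkTableParser()
parser.feed(SAMPLE)
items = parser.items
```

It takes noticeably more code to get the same list of dictionaries, which is a fair argument for reaching for BeautifulSoup in the first place.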

To write, and to write what.

Some of you may know I have written about 30% of a book, called "Python No Muerde", available at http://nomuerde.netmanagers.com.ar (in Spanish only). That book has stagnated for a long time.

On the other hand, I wrote a very popular series of posts, called PyQt by Example, which has (you guessed it) stagnated for a long time.

The main problem with the book was that I tried to cover way too much ground. When complete, it would be a 500-page book, and that would involve writing half a dozen example apps, some of them in areas where I am no expert.

The main problem with the post series is that the example is lame (a TODO app!) and expanding it is boring.

So, what better way to fix both things at once than to merge them?

I will leave Python No Muerde as it is, and will do a new book, called PyQt No Muerde. It will keep the tone and language of Python No Muerde, and will even share some chapters, but will focus on developing a PyQt app or two, instead of the much more ambitious goals of Python No Muerde. It will be about 200 pages.

I have acquired permission from my superiors (my wife) to work on this project a couple of hours a day, in the early morning. So, it may move forward, or it may not. This is, as usual, an experiment, not a promise.

Antisocial Networks

I love http://goodreads.com very much. It has measurably improved my life as a reader. I have read authors I wouldn't have read without it, and books from those authors I would have ignored, and it keeps track of what I read, am reading and will read.

What it has never been for me is a social network. I would be about as happy with it if I knew no one else on the site, if it were just me and a bazillion strangers whose taste I can leech off.

Sure, I have a few friends there nowadays, but I hardly ever do anything "social" beyond accepting requests and posting reviews that I have no idea whether anyone reads.

I love Flickr, where I put most of my pictures (soon: all of my pictures). It's cheap and I can upload an almost infinite amount of pics there, and I can share them with friends and family sometimes (by reposting them to Facebook).

They were even kind enough to store the pictures I uploaded as a free user until I paid for the space to store them 5 years later.

I love Twitter because it's a place to post short things that don't deserve a blog post, to chatter with friends and not-so-friends, to get to know more people, and to waste some time every day.

One of those things is not like the others. One of those things I use for its social features; the others I use for other reasons, and I don't really care about them being social or not.

I think nowadays, for a social network to succeed, it has to cater to the antisocial, at least at first, when you know no one there. I don't go to Flickr to debate. I don't go to Goodreads to chat. I go there to put pictures and keep my books straight. And that's what kept me there long enough to meet people.

The blogs I don't have

  • Things you only like or believe because your mom said so.

  • Tips for Time Travelers.

  • Cute plants and their antics.

  • 1001 ways to peel a cat.

  • Things morticians say.

  • Traveling for Time Tippers.

  • Coins of the world: what do they taste like?

  • Things found in people's noses.

  • Surprise, that is not chicken!

  • Time for Tip Travelers.

  • World of Lint.


Contents © 2000-2024 Roberto Alsina