Ralsina.Me — Roberto Alsina's website

So much cool stuff, so little time.

I read Zack Rusin's blog about benchmarking vector graphics APIs... then I see a comment mentioning Antigrain. Then I check the Antigrain examples, and they are gorgeous, and pretty fast! Even on a lame SiS630!

Then it hit me... I am never going to do anything with it (or with Qt's Arthur). Maybe I am getting old, but I see a swirl of cool software... dparser... asymptote... txt2tags... (and those are only the ones I saw in the last week).

All of them are about something that interests me, but I simply can't do anything. I mean, would it be cool to write a vector-app-for-kids with Antigrain (or Arthur)? Sure!

Would I like to implement this shell-style language I have had floating in my head for a year, using dparser (or pyparsing)? Yeah! Would I like to hack a Trac plugin using txt2tags (or reStructuredText)? Sure!

But when can I do that? I have my busi­ness, my wife, her preg­nan­cy, my oth­er project­s... maybe that's what hap­pens when you be­come old. You gath­er enough bag­gage that you can't lift any more back­packs in your trek.

But what can I do with all the ideas swirling in my head? Re­al­ly! What?

Moving load around with netpipes.

I had an emergency. The CPU usage of a certain mail server was rising, and the culprit was clamd.

For some reason, over the last few months, clamd's CPU usage kept rising, and it now averaged nearly 70% of the server's CPU.

Re­mov­ing the an­tivirus is, of course, not an op­tion. On the oth­er hand, per­for­mance was start­ing to suf­fer.

The usu­al re­sponse would be a full re­tool­ing of the se­tup, mul­ti­ple SMTP servers han­dling the load against a cen­tral stor­age server, cla­mav run­ning on each SMT­P... but switch­ing to that in­volves a full reim­ple­men­ta­tion of the sys­tem. Be­cause of the an­tivirus??? Hell no.

So, I start­ed in­ves­ti­gat­ing how I could move clamd to an­oth­er box, like I did with spa­mas­sas­sin. It was not pret­ty.

  • cla­­mav has a pro­­to­­col de­fined for con­nec­t­ing to re­­mote server­s.

  • cla­­mav does­n't have a client for it.

  • clamd-stream-­­client does­n't seem to work.

So, I thought... let's be orig­i­nal. What do I ac­tu­al­ly need?

I need to be able to call clamdscan, and have it scan the current folder. Based on its exit status code (0/1/2), the mail is accepted, rejected, or temporarily rejected.

Having the same folder structure available to two boxes is trivial. I have NFS, lots of bandwidth and another computer.

Running clamdscan on the second box, scanning those folders, is trivial too.

The missing piece is a way to tell the second box's clamd to scan, and get the exit code back on the mail server.

En­ter net­pipes!

Netpipes is software to "make TCP sockets usable from the shell". You can find it at http://web.purplefrog.com/~thoth/netpipes/netpipes.html.

And here's a re­place­ment clamd­scan which works the way I want­ed it:

#!/bin/dash
# Send the current folder to the scanning box; its one-line reply becomes our exit status.
exit `echo "$PWD" | hose 192.168.1.53 9000 --slave`

This ver­sion takes the fold­er you want to scan as an ar­gu­men­t:

#!/bin/dash
# Send the folder(s) given as arguments instead of the current folder.
exit `echo "$*" | hose 192.168.1.53 9000 --slave`
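Both client scripts exit with whatever text comes back over the socket. Here is a minimal sketch of a guard, assuming you want anything other than a clean 0 or 1 (an empty reply, garbage, a dropped connection) treated as status 2, so the mail is temporarily rejected instead of accepted by accident. The helper name normalize_status is made up for this example; it is not part of netpipes or ClamAV.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original scripts.
# Map the reply from the scanning box to a clamdscan-style status:
#   0 = clean, 1 = virus found, anything else (including empty) = 2
normalize_status() {
    case "$1" in
        0|1) echo "$1" ;;
        *)   echo 2 ;;
    esac
}

# How it could be used in the wrapper (hose and the address come from
# the scripts above; left commented out so this sketch needs no network):
# reply=`echo "$PWD" | hose 192.168.1.53 9000 --slave`
# exit `normalize_status "$reply"`
```

The design choice is to fail towards "try again later": a broken link to the scanner should delay mail, not let it through unscanned.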

And here is the "serv­er side". First net­clam.sh:

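#!/bin/dash -x
# Read one line with the folder(s) to scan from the socket (stdin).
read -r args
# $args is left unquoted so several folders are passed as separate arguments.
/usr/bin/clamdscan $args >/dev/null 2>&1
# Reply with clamdscan's exit status (0, 1 or 2).
echo $?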

Then the "net­work code":

faucet 9000  --in --out /usr/bin/netclam.sh

And there you have it. Cla­mAV moved to an­oth­er serv­er. With 5 lines of shell code.
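The whole protocol is one line in (the folder to scan) and one line out (the exit status), so the server side can be tried without faucet or a real clamd. A rough sketch of that, assuming a SCANNER variable that stands in for /usr/bin/clamdscan; both SCANNER and the function name handle_request are inventions for this example only.

```shell
#!/bin/sh
# SCANNER is a hypothetical override so the handler can be exercised
# without ClamAV installed; by default it is the real scanner.
SCANNER=${SCANNER:-/usr/bin/clamdscan}

handle_request() {
    # One request per connection: a single line naming the folder(s).
    read -r args
    # Unquoted on purpose, same as netclam.sh: several folders may be sent.
    $SCANNER $args >/dev/null 2>&1
    # The reply is just the scanner's exit status as text.
    echo $?
}
```

Feeding it by hand, for instance `SCANNER=true; echo /tmp | handle_request`, prints the status the client would receive. Under faucet the same function would simply read from and write to the socket.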

No, I don't get a dime from them

For a few months I have been us­ing an un­man­aged vir­tu­al pri­vate serv­er from Tek­ton­ic, and I love it.

What's that? Let's take it one word at a time, and then some more.

  1. It's a serv­er: which means it's a ful­l-ish lin­ux in­­stal­la­­tion. So it is ca­­pa­ble of do­ing lots of things. I can run all sorts of weird python thin­­gies in it if I wan­t. IMAPS and SSMT­P? No prob­le­­mo.

  2. It's pri­­vate: which means I am root on it. I have the shel­l. I choose what to in­­stal­l.

  3. It's vir­­tu­al: it's a Vir­­tuoz­­zo par­ti­­tion in a re­al serv­er. That means no cus­­tom ker­nel mod­­ules, and that since al­­most ev­ery­thing is shared with oth­­er in­­s­tances, 5GB of disk and 128MB of RAM go a long way.

  4. It's un­­man­aged: which means I man­age it. Which is just the way I pre­fer it, since that's my job.

  5. It's cheap. I started on an 8-dollars-a-month plan (which doesn't seem to be there anymore; the current cheapest is a 15-dollar plan).

  6. It's a throwaway. I want to host some client as a favour? I just put it there. I could even rent another of these servers for a while, use it, then close it. Backups? Clicking on a webpage saves the image! Other than that... I back it up.

  7. Fixed IP­s. All you want (for ex­­tra coin­s).

  8. A home away from home. All my stuff is there. I need it, I get it. With­­out both­­er­ing about hav­ing my own serv­er at home via no-ip or some­­such (which of course I still have too ;-)

  9. It works. It hard­­ly ev­er break­s. And hav­ing sur­­vived ex­pen­­sive, man­aged server­s, this ba­­by is work­ing just as well.

  10. It's a nice gift. Sup­­pose you have a con­nec­­tion to a free soft­­ware pro­­jec­t/LUG/­­fam­i­­ly/what­ev­er, and they need a place on the in­­ter­net. Why not spon­­sor them with some­thing like this? I of­fered one to PyAr (which did­n't take it, but it's the thought that counts ;-)

  11. The ul­ti­­mate learn­ing ex­pe­ri­ence: you can re­­s­tore the sys­tem in 2 min­utes. Want to play/learn sysad­min­ing? Do it on the re­al vir­­tu­al thing! Much cheap­­er than hos­ing your own box ;-)

  12. They of­fer a good ser­vice. So, peo­­ple should know about it. And of course... if you know a sim­i­lar, but even bet­ter deal... I'm all ears!

A different UNIX Part II: A better shell language

One of the things people study when they "learn unix" is shell scripting and usage. Because every system has a shell, and if you learn to use it interactively, you are halfway to automating system tasks!

Let's con­sid­er that for a mo­men­t... what are the odds that the same lan­guage can be good for in­ter­ac­tive use and for pro­gram­ming? I say slim.

Not to men­tion that learn­ing shell as a way to learn unix is like go­ing to a school that teach­es TV pro­duc­tion, and study­ing the re­mote. While use­ful, not re­al­ly the im­por­tant tool (ok, that anal­o­gy does­n't work at al­l. But it sounds neat, does­n't it?).

The first thing is that to­day's Lin­ux dom­i­na­tion of the unix­sphere has caused a se­ri­ous mono­cul­ture in shell script­ing: ev­ery­one us­es bash. The more en­light­ened ones may check that their scripts work on some oth­er Bourne-style shel­l.

There are no important distributions (or proprietary unixes) that use csh or anything like it. Debian has a policy that things should work without bashisms. That's about as good as it gets.

Writ­ing a dozen pages on how shell sucks would be triv­ial. But un­in­ter­est­ing.

So, let's think it over, and start from the top.

What should a shell script­ing lan­guage be like?

What does­n't mat­ter?

Let's tack­le these things. I in­vite any­one to add ex­tra ideas in the com­ments sec­tion.

What should a shell scripting language be like?

  • In­­ter­pret­ed (ob­vi­ous)

  • Dynamic typing (you will be switching ints to strs and vice versa all the time).

  • Easy in­­­cor­po­ra­­tion of oth­­er pro­­grams as func­­tion­s/meth­od­s/what­ev­er.

    That pret­­ty much is what makes it a shel­l. ls should be in­­dis­­t­in­guish­able from some­thing writ­ten us­ing the shell it­­self.

  • Pipes. This is a must. Unix has a bazillion tools meant to be used in command pipelines. You can implement an RDBMS using that kind of thing (check out nosql). Leverage that.

    But even here, where it is strongest, the shell is not perfect. Why can't I easily pipe stderr and stdout to different processes? Why can't I pipe the same thing to two processes at the same time? (Yes, I know how to do it with a neat trick ;-)

  • Glob­bing. *.txt should give you a list of files. This is one of the ob­vi­ous things where sh is bro­ken. *.txt may be a string or a list, de­pend­ing on con­tex­t... and a list is just a se­ries of strings with blanks. That is one of the bazil­lion things that makes writ­ing shell scripts (at least good ones) hard:

    [ralsina@monty ralsina]$ echo *out
    a.out
    [ralsina@monty ralsina]$ echo *outa
    *outa

  • A list data type. No, writing strings separated with spaces is not OK. Maybe a Python-style dictionary as well?

  • Func­­tions (ob­vi­ous)

  • Li­braries (and ok, the shell source mech­a­nism seems good enough)

  • Stand­alone. It should­n't spawn sh for any rea­­son ;-)
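To make the pipes complaint concrete, here is one way both things can be done in plain sh today. It is only a sketch, and not necessarily the neat trick mentioned above: it uses an extra file descriptor to split stdout from stderr, and tee plus a named pipe to feed two processes at once. The noisy function and all the file names are invented for the demo.

```shell
#!/bin/sh
# Hypothetical demo command: one line on stdout, one on stderr.
noisy() {
    echo "data"
    echo "warning" >&2
}

workdir=$(mktemp -d)

# 1. stdout and stderr to different processes.
#    fd 3 carries the real stdout around the inner pipe, so the inner
#    sed only ever sees stderr.
{ noisy 2>&1 1>&3 | sed 's/^/ERR: /' > "$workdir/err.log"; } 3>&1 |
    sed 's/^/OUT: /' > "$workdir/out.log"

# 2. The same stream to two processes: tee writes one copy into a
#    named pipe read by a background process, while the other copy
#    goes down the ordinary pipe.
mkfifo "$workdir/copy"
tr a-z A-Z < "$workdir/copy" > "$workdir/upper.log" &
printf 'a\nb\nc\n' | tee "$workdir/copy" | sort -r > "$workdir/sorted.log"
wait

# Collect the results, then clean up the scratch directory.
err_result=$(cat "$workdir/err.log")
out_result=$(cat "$workdir/out.log")
upper_result=$(cat "$workdir/upper.log")
sorted_result=$(cat "$workdir/sorted.log")
rm -rf "$workdir"
```

It works, but the descriptor juggling in step 1 and the mkfifo and wait bookkeeping in step 2 are exactly the kind of ceremony a better shell language could hide behind real syntax.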

What doesn't matter?

  • Performance. Ok, it matters that a five-liner doesn't take 50 minutes unless it has to. But one second or two seconds? Not that important.

  • Ob­­ject ori­en­­ta­­tion. I don't see it be­ing too use­­ful. Shell scripts are old-­­fash­ioned :-)

  • Compatibility with current shells. Come on. Why be like something that sucks? ;-)

Now, the example

Let's con­sid­er a typ­i­cal piece of shell script and a re­write in a more rea­son­able syn­tax.

This is bash (no it does­n't work on any oth­er shel­l, I think):

DAEMONS=( syslog network cron )

# Start daemons
for daemon in "${DAEMONS[@]}"; do
      if [ "$daemon" = "${daemon#!}" ]; then
              if [ "$daemon" = "${daemon#@}" ]; then
                      /etc/rc.d/$daemon start
              else
                      stat_bkgd "Starting ${daemon:1}"
                      (/etc/rc.d/${daemon:1} start) &>/dev/null &
              fi
      fi
done

And since DAEMONS is something the admin writes, this script lets you shoot yourself in the foot in half a dozen ways, too.

How about this:

DAEMONS=["syslog","network","cron"]

# Start daemons
for daemon in DAEMONS {
      if ( daemon[0] != "!" ) {
              if ( daemon[0] == "@" ) {
                      stat_bkgd ("Starting "+daemon[1:])
                      /etc/rc.d/+daemon[1:] ("start") &> /dev/null &
              } else {
                      /etc/rc.d/+daemon ("start")
              }
      }
}

Of course the syntax is something I just made up as I was writing, but isn't it nicer already?
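For comparison, the prefix tests in the bash version earlier in this post ("$daemon" = "${daemon#!}" and friends) boil down to "does the entry start with ! or @?". Here is a small sketch of that same decision in portable sh using case, which should also run under dash. The function name daemon_mode is made up, and it only prints what it would do instead of touching /etc/rc.d.

```shell
#!/bin/sh
# Hypothetical helper: classify one DAEMONS entry the way the bash loop does.
#   !name -> skip, @name -> start in background, name -> start in foreground
daemon_mode() {
    case "$1" in
        '!'*) echo "skip" ;;
        @*)   echo "background ${1#@}" ;;
        *)    echo "foreground $1" ;;
    esac
}

# Print the decision for a few sample entries.
for entry in syslog @network '!cron'; do
    daemon_mode "$entry"
done
```

It reads better than the nested tests, but the list of daemons is still a string of words separated by blanks, which is the real itch here.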

Try htop

Ever needed a process monitor that runs in a terminal? Have you been using top? Use htop instead. Much, much, much nicer!


Contents © 2000-2024 Roberto Alsina