
Friday, June 28, 2013

Shared library writeup: Part 4

I ended last time by saying that the dynamic linker has three main tasks - determine and load dependencies, relocate the application and its dependencies, and initialize the application and its dependencies - and that the key to speeding up all of these is to have fewer dependencies in the application.

Now, we're going to look at the relocation process more thoroughly. First of all, what's going on? What does 'relocation' mean?

I'm by no means an expert in this, but I'm going to venture an attempt at an explanation: After an ELF object has been compiled, it has an entry point address - in other words, an address such that, once the object has been loaded into memory and control is transferred to that address, the ELF object will start executing.

However, there are at least a couple of caveats here. First of all: even if your ELF object has a fixed entry point address, that doesn't mean it will be loaded into actual physical memory at this address. Each process gets its own virtual memory space - a 'platonic' address space that is mapped onto physical memory behind the scenes. So the application might get loaded at the entry point address of the virtual memory space, but this address will correspond to another address entirely in physical memory.

The second point is that if we're not talking about an executable but rather a dynamic shared object - as we are here (or rather, we have one executable with a potentially high number of dynamic shared objects that are to be associated with it) - the entry point address isn't even the address the object will end up at in the final executable: it gets shifted depending on what the linker determines is the best way to combine the addresses of all participating DSOs. This means that all 'internal' addresses in that object get shifted by the same amount as well. That shifting is what we're currently talking about when we use the term 'relocation'.

So the thing we're going to talk about is how the linker accomplishes this relocation - and especially the part where it has to synchronize all the load addresses, etc. First, it must be noted that there are two types of dependencies a given DSO can have. You can have dependencies that are located within the same object - which I imagine happens when you create an object file with two functions/subroutines, one of which depends on the other - and you can have dependencies that come from a different object.

The first kind of dependency is easy to handle, given that you know the 'new' entry point address of the object in question. For each such dependency, you calculate its relative offset from the entry point, and then simply add this offset to the new entry point.
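A minimal sketch of that arithmetic, just to make it concrete (the numbers are made up and this is of course not real linker code):

old_entry = 0x0              # entry point address the object was originally given
new_entry = 0x7f3a00000000   # entry point address it ends up with after loading

internal_addresses = [0x120, 0x4f8, 0x1020]   # addresses inside the same object

relocated = []
for addr in internal_addresses:
    offset = addr - old_entry              # relative offset from the old entry point...
    relocated.append(new_entry + offset)   # ...added to the new entry point

print([hex(addr) for addr in relocated])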

The second type of dependency resolution is more involved, and I'm going to talk about that more the next time.

Thursday, June 27, 2013

Multiplying lots of matrices in NumPy

The other day I found myself needing to perform matrix multiplication in Python, using NumPy. Well, what's the big deal, you say? You do know that there exists a dot method, right?

Yes, I do know that, you smart internet person you. However, my problem was that I had a number of matrices for which I wanted to perform the same type of matrix multiplication. I had on the order of a hundred thousand five by two matrices that were to be transposed and multiplied with another hundred thousand five by two matrices.

Well, duh, you say. The dot method can handle more than two dimensions, you know. Yeah, I know that as well. However, it doesn't handle it the way I needed for this task. I wanted to end up with a hundred thousand two by two matrices. Had I used dot, I would have ended up with a hundred thousand by two by hundred thousand by two matrix.

So, I had to improvise:
>>> A.shape
(100000, 2, 5)
>>> B.shape
(100000, 2, 5)
>>> result = np.sum(np.swapaxes(A, 1, 2)[:, np.newaxis, :, :] * B[:, :, :, np.newaxis], 2)


Kind of involved, but it worked. I got the initial idea from here, but the solution given there only works for square matrices - for non-square ones you have to shift the newaxis one step to the left, or you will violate NumPy's broadcasting rules.
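For completeness, here is a self-contained version of the same computation together with an equivalent np.einsum formulation, which spells the index gymnastics out explicitly (dummy data, and I haven't benchmarked one against the other):

import numpy as np

# Dummy data with the same shapes as above (the values don't matter for the point).
A = np.random.rand(100000, 2, 5)
B = np.random.rand(100000, 2, 5)

result = np.sum(np.swapaxes(A, 1, 2)[:, np.newaxis, :, :] * B[:, :, :, np.newaxis], 2)

# The same contraction with einsum - for each i this is B[i] @ A[i].T, a 2x2 matrix.
result_einsum = np.einsum('ikm,ijm->ijk', A, B)

print(result.shape)                         # (100000, 2, 2)
print(np.allclose(result, result_einsum))   # True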

Tuesday, June 25, 2013

Shared library writeup: Part 3

I ended last time by talking about how the dynamic linker has to run before an ELF file can run. In other words, at this point the dynamic linker has been loaded into the memory space of the process that we want to run, and we're going to run the linker before we can run our actual application.

We cannot run the linker quite yet, however. The linker has to know where it should transfer control to after it has completed whatever it is supposed to do. This is accomplished by the kernel - it puts an auxiliary vector on top of the process' stack. This vector is a name-value array (or in Python terms, a dictionary), and it carries information about the things I talked about last time - for example where the application's program header table sits in memory and what the application's entry point is. The values that can appear in the auxiliary vector are defined in the elf.h header file, and have the prefix AT_. After this vector has been set up, control is finally transferred to the dynamic linker. The linker's entry point is defined in the ELF header of the linker, in the e_entry field.
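On Linux you can actually peek at this vector for a running process through /proc. Here is a small sketch that dumps the raw (type, value) pairs for the current process - assuming a 64-bit Linux system, where each entry is a pair of 8-byte unsigned integers and the list is terminated by an AT_NULL (0) entry:

import struct

with open('/proc/self/auxv', 'rb') as f:
    data = f.read()

# The auxiliary vector is a flat list of (a_type, a_val) pairs.
for offset in range(0, len(data), 16):
    a_type, a_val = struct.unpack_from('QQ', data, offset)
    if a_type == 0:      # AT_NULL marks the end of the vector
        break
    print(a_type, hex(a_val))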

Trusting the linker


At this point the (dynamic) linker is supposed to do its magic. It has three tasks:
  • Determine and load dependencies
  • Relocate the application and all dependencies
  • Initialize the application and dependencies in the correct order
The paper doesn't go into detail on the first and last points - I suppose the author considers them trivial - but the key to speeding up these two is having fewer dependencies: the time needed to sort through all dependencies and load them in order usually scales worse than linearly with the number of dependencies. However, the relocation process is usually the most expensive part of the linker's work, scaling at least as O(R + nr), where R is the number of relative relocations, r is the number of absolute relocations, and n is the total number of DSOs in the application. There can also be additional overhead associated with the symbols in the process, depending on which ELF extensions are being used and so on. In such cases, the cost can be O(R + nr log s), where s is the number of symbols.
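If you want a feel for R and r for a particular DSO on your own system, readelf will list the relocations for you. A rough sketch (assuming binutils' readelf is installed, an x86-64 shared object, and a made-up library path):

import subprocess

out = subprocess.check_output(['readelf', '-r', '/usr/lib/libexample.so'],
                              universal_newlines=True)

lines = out.splitlines()
relative = sum('R_X86_64_RELATIVE' in line for line in lines)
others = sum('R_X86_64_' in line and 'RELATIVE' not in line for line in lines)
print('relative relocations (R):', relative)
print('other relocations (r):', others)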

Thursday, June 20, 2013

Check for duplicates in Python

Today's trick: Check whether a Python container cont contains duplicates!

if len(cont) != len(set(cont)): raise ValueError('duplicates in cont')

Neat!
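This only works for hashable elements, of course. And if you also want to know which elements are the duplicates, collections.Counter gets you there in one more line (a small sketch with a made-up container):

from collections import Counter

cont = ['a', 'b', 'c', 'b', 'a']
duplicates = [item for item, count in Counter(cont).items() if count > 1]
print(duplicates)   # ['a', 'b'] (order may vary)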

Wednesday, June 19, 2013

Fugitive.vim

All of my posts presently are written in the nick of time. It could be symptomatic of either bad planning or too much to do, but I will try to keep up the current schedule for a while longer at least.

As stated in another post, one of the plugins I am using for vim is the fugitive.vim plugin, written by Tim Pope. Its description is "A Git wrapper so awesome, it should be illegal". And so far, I have to agree - it is pretty darn awesome.

My favorite feature so far is the :Gdiff mode. It has done wonders for the tidiness of my git history. It used to be that I would edit a file, spot a minor bug that was unrelated to whatever I was currently implementing, and then I had to either a) fix the bug and implement whatever, committing everything in one large chunk, thus messing up the Git history, or b) stash the changes so far, fix the bug, commit it, then continue implementing.

Option b) isn't actually that bad in itself. It just takes a little more time. However, if you spot multiple bugs, or do several separate modifications of the file, the stashing can get a little messy.

Now, the Gdiff command opens up the file you're editing together with the current HEAD version of that file. (Or actually, that's not exactly what is opened, but I have to research more about how Git does stuff before I have more to say). It opens these two files in Diff mode (which I didn't even know about prior to this). It then allows you to choose hunks to stage for a commit, so that you don't have to commit everything in the file at once, if you don't want to. (A hunk is one continuous piece of changed 'stuff'). However, you can even break up the hunks by specifying the lines you want to 'diffput'.

In short - it's awesome. It has other neat features as well, but those will have to come at another time. I might also write a more technical piece on the Gdiff thing.

Tuesday, June 18, 2013

TRAPs

As a true nerd, I am currently GMing an RPG campaign.

Preparing for sessions can be a chore - I find myself wondering how to structure the stuff I'm making up, and thinking about ways to organize it often takes more time than the actual campaign writing.

However, there is one technique that I am extremely thankful for coming across - the TRAPs method, invented by Ry at Enworld forums: ry's Threats, Rewards, Assets and Problems (TRAPs)

Not that kind.
 
It's really making my life as a GM a whole lot easier, because it's a simple algorithm for fleshing out an adventure or encounter: everything you add should be either a Threat, Reward, Asset or Problem. If you're introducing something that is none of these, it's ineffective. And before you complain about 'atmosphere' and so on - you can easily turn any of these things into something that provides atmosphere.

Right now, I don't have enough time to write about this more elaborately (I have to prepare for the session), but once I have tried it out a bit more, I will try to write down my experiences.

Monday, June 17, 2013

Shared library writeup: Part 2

I ended last time talking about how the program header table contains references to the segments of an ELF file. Now, these segments can have various access permissions - some parts are executable but not writable, while some are writable but not executable.

Having a lot of non-writable segments is a good thing: in addition to the data being protected from unintentional or malicious modification, it means that these segments can be shared when several applications use them.

The way the kernel knows what type each segment is, is by reading the program header table, where this information is located. Each entry in this table is represented by a C struct called Elf32_Phdr or Elf64_Phdr.

However, the program header table is not located at a fixed place in an ELF file. The only thing that is fixed is the ELF header, which is always put at 'offset' zero - essentially, at the beginning of the file. (Offset means how many bytes from the beginning something is located.) This header is also represented by a C struct, called Elf32_Ehdr or Elf64_Ehdr (the 32 or 64 refers to whether the computer architecture is 32-bit or 64-bit, respectively - i.e., whether its registers, memory addresses and buses are 32 or 64 bits wide).

Now, the ELF header struct contains several pieces of information (fields) that are necessary to determine where the program header table is. Writing down all of these pieces would essentially mean copy-pasting the article I'm reading, so I think I will not go down to that level of granularity.
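Instead, here is a small sketch of how those fields can be read out with Python's struct module, assuming a 64-bit, little-endian ELF file (the field order is the one from elf.h, and the path is just an example):

import struct

path = '/bin/ls'   # any ELF file will do
with open(path, 'rb') as f:
    header = f.read(64)   # an Elf64_Ehdr is 64 bytes

# e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
fields = struct.unpack('<16sHHIQQQIHHHHHH', header)
e_ident, e_entry, e_phoff = fields[0], fields[4], fields[5]
e_phentsize, e_phnum = fields[9], fields[10]

print(e_ident[:4])            # b'\x7fELF' - the magic number
print('entry point:', hex(e_entry))
print('program header table at offset', e_phoff,
      '-', e_phnum, 'entries of', e_phentsize, 'bytes each')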

Once the kernel has found the program header table, it can start reading information about each segment. The first thing it needs to know is which type the segment is, which is represented by the p_type field of the program header struct. If this field has the value PT_LOAD, it means that this segment is 'loadable'. (Other values this field can have are PT_DYNAMIC, which means that the segment contains dynamic linking information, PT_NOTE, which means the segment contains auxiliary notes, et cetera.) If the p_type field has the value PT_LOAD, the kernel must, in addition to knowing where the segment starts, also know how big it is, which is specified in the p_filesz field. There are also a couple of fields that describe where the segment is located in virtual memory space. However, the actual offset in virtual memory space is irrelevant for DSOs that are not linked, since they haven't been assigned a specific place in virtual memory space yet. For executables and so-called 'prelinked' DSOs (meaning that they have been bound to an executable even if they're dynamic), the offset is relevant.

However, even though the offset in memory is irrelevant for unlinked DSOs, the virtual memory size of the segment is relevant. This is because the actual memory space that the segment needs can be larger than the size of the segment in the file. When the kernel loads the segment into memory, if the requested memory size is larger than the segment's file size, the extra memory is initialized with zeroes. This is practical if there are so-called BSS sections in the segment. BSS is an old name for a section of data that contains only zero bits. Thus, as long as the extra memory is initialized with zeroes, this is a good way to save space - you only need to know how large the BSS section is, add that size to the in-file size of the segment, and the kernel handles the rest. An example of a BSS section is one containing uninitialized global (or static) variables in C code, since such variables are set to zero in C anyway.
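Continuing the sketch from above: once e_phoff, e_phentsize and e_phnum are known, walking the program header table and comparing p_filesz to p_memsz for the loadable segments looks roughly like this (same 64-bit, little-endian assumptions; the PT_* constants are the values from elf.h):

import struct

PT_LOAD, PT_DYNAMIC, PT_INTERP, PT_NOTE = 1, 2, 3, 4

path = '/bin/ls'
with open(path, 'rb') as f:
    data = f.read()

e_phoff, = struct.unpack_from('<Q', data, 32)        # offset of the program header table
e_phentsize, e_phnum = struct.unpack_from('<HH', data, 54)

for i in range(e_phnum):
    # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
    p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align = \
        struct.unpack_from('<IIQQQQQQ', data, e_phoff + i * e_phentsize)
    if p_type == PT_LOAD:
        # p_memsz > p_filesz means the tail of the segment (the BSS) is zero-filled.
        print('PT_LOAD segment: filesz=%d, memsz=%d' % (p_filesz, p_memsz))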

Finally, each segment has a logical set of permissions that is defined in the p_flags field of the program header struct - whether the segment is writable, readable, executable or any combination of the three.

After this, the virtual address space for the ELF executable is set up. However, the executable binary at this point only contains the segments that had the PT_LOAD value in the p_type field. The dynamically linked segments are not yet loaded - they only have an address in virtual memory. Therefore, before execution can start, another program must be executed - the dynamic linker.

The dynamic linker is a program just like the executable we're trying to run, so it has to go through all the above steps. The difference is that the linker is a complete binary, and it should also be relocatable. Which linker is used is not specified by the kernel - it is contained in a special segment of the ELF file, which has the PT_INTERP value in the p_type field. This segment is just a null-terminated string specifying which linker to use. And the load address of the linker should not conflict with that of the executable it is run for.
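Finding out which linker a given binary asks for is then just a matter of locating the PT_INTERP segment and reading the string it contains. A sketch under the same assumptions as the ones above - on a typical x86-64 Linux system this should print something like /lib64/ld-linux-x86-64.so.2:

import struct

PT_INTERP = 3

path = '/bin/ls'
with open(path, 'rb') as f:
    data = f.read()

e_phoff, = struct.unpack_from('<Q', data, 32)
e_phentsize, e_phnum = struct.unpack_from('<HH', data, 54)

for i in range(e_phnum):
    p_type, _, p_offset, _, _, p_filesz, _, _ = \
        struct.unpack_from('<IIQQQQQQ', data, e_phoff + i * e_phentsize)
    if p_type == PT_INTERP:
        # The segment is simply a null-terminated path to the dynamic linker.
        print(data[p_offset:p_offset + p_filesz].rstrip(b'\0').decode())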

This ends the second part of the writeup. And there's plenty left..

Friday, June 14, 2013

Shared library writeup: Part 1

During my daily work this week, I found myself struggling with shared libraries, linking them, and the various compiler flags needed to make the type of library you want. I decided to actually learn this stuff once and for all, and so I am currently reading "How to write shared libraries" by Ulrich Drepper. I decided this was a perfect opportunity to multitask - both write stuff for the blog and learn something! Especially since you learn much better by writing about it. Hence, this will be the first part of my writeup of Drepper's paper.

In the most abstract sense, libraries are collections of code gathered into one file for easy reuse. They can be static, meaning that if you want to use the code in a program, the compiler must take the code contained in the library and bake it into the program upon compilation. Alternatively, they can be shared or dynamic, meaning that they are not included in the program upon compilation; instead the program contains references to the libraries, so that at run time the program loads each library and incorporates it.

Nowadays (on Unix-like systems), libraries are handled by the so-called ELF format (Executable and Linkable Format), which is a common file format that is used not just for libraries, but for executables and other types of files as well.

Earlier, other formats, such as a.out and the Common Object File Format (COFF), were used. The disadvantage with these was that they did not support relocation.

When you have a piece of compiled code (typically in what's called an object file), this file will contain a relocation table. Such a table is a list of pointers to various addresses within that object file, and these addresses are typically given relative to the beginning of the file (which is typically zero). When combining several such object files into one large executable, this object-file-specific list must typically be changed, since the object file is no longer located at 'zero', but rather at some arbitrary point within the new executable. Then, when the executable is to be executed, the addresses are again modified to reflect the actual addresses in RAM. This last part is what is not supported by the old library formats.
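As a toy illustration of those two rounds of address adjustment (the numbers are made up, and this has nothing to do with the real on-disk data structures):

# Addresses in the relocation table of one object file, relative to its own start.
reloc_table = [0x10, 0x48, 0xa0]

# Link time: the object ends up at some offset inside the combined executable.
offset_in_executable = 0x2000
reloc_table = [addr + offset_in_executable for addr in reloc_table]

# Run time: the executable itself is loaded at some address in (virtual) memory.
load_address = 0x400000
reloc_table = [addr + load_address for addr in reloc_table]

print([hex(addr) for addr in reloc_table])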

The lack of run-time relocation essentially means that each library must be given an absolute address in virtual memory upon creation, and that some central authority must keep track of where the various shared libraries are stored. In addition: when we make additions to a library that is supposed to be shared, we don't want to have to tell all the applications that used the old version that the library has changed - as long as the new version still contains all the stuff we need for our application, it should still work for that application without having to re-link the application against the new version of the library. This means that the table that points to where the various parts of the library are located must be kept separate from the actual library, and it must keep track of the pointer tables of all the old versions of that library - once a function had been added to a library, its address lasted forever. New additions to a library would just append to the existing table. In short, a.out and COFF were not very practical for use as shared libraries, although they did make programs run fast, since there is no relocation of table pointers at run time.

Enter ELF

ELF is, as mentioned before, a common file type for applications, object files, libraries and more. It is therefore very easy to make a library once you know how to make an application - you just pass in an additional compiler flag. The only difference between them is that applications usually have a fixed load address, that is, the (virtual) memory address into which they are loaded upon execution. There is a special class of applications, called Position Independent Executables (PIEs), that don't even have a fixed load address, and for those, the difference between applications and shared libraries is even smaller.

For an application that contains no dynamic components (no shared libraries, etc.), execution is straightforward: the application is loaded into memory, then the instruction at the 'entry point' memory address is executed, which should start a chain of events that ends with the termination of the program.

For applications that do contain dynamic components, it is less straightforward: There must be another program that can coordinate the application with the DSOs (Dynamic Shared Objects) before execution of the program starts.

The ELF file structure

ELF files usually contain the following:
  • the ELF header
  • the program header table
  • the section header table
  • sections
Sections are the meat of the ELF file - they contain all the information about what is actually going on. The other structures are just there to organize the sections.

The section header table is a table with references to the various sections of the file. The program header table contains references to various groupings of the sections. So you might say that the section header table describes each 'atom' of the file, whereas the program header table collects these atoms into 'molecules': sensible chunks, called segments, in which sections work together to form a coherent whole.
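You can see both views of the same file with readelf: -S lists the sections (the atoms), and -l lists the segments (the molecules), including a 'Section to Segment mapping' at the end of its output. A quick sketch, assuming binutils' readelf is installed and using /bin/ls as an arbitrary example:

import subprocess

path = '/bin/ls'   # any ELF file will do

print(subprocess.check_output(['readelf', '-S', path], universal_newlines=True))
print(subprocess.check_output(['readelf', '-l', path], universal_newlines=True))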

End of part 1 of the writeup! And I'm only on page 3 of the paper!

Wednesday, June 12, 2013

Vim

I use vim for editing.

There is no overwhelmingly reasonable reason I chose vim. Several years ago, when I first started programming on a significant basis, I started reading about the editor wars, and I immediately knew I had to make a choice and stick with it. I think the article that stands out as the main reason for my choice is this one:

Pirates and ninjas: Emacs or Vi?

Of all the articles that could have been the basis for my choice, this is probably one of the least reasonable. However, when I did read this article, I didn't know anything about what would be useful when programming. And so, connecting both editors to ninjas and pirates made it easy to make a choice (which, I think, matters not that much in the long run anyway).

Vim: for when cannons and cutlasses just won't cut it.
Ninjas simply appeal more to me than pirates do, and knowing nothing else, I chose vim. I cannot say I regret the choice, but that could easily be just because I haven't tried emacs.

(Short aside: When I read the above article, I didn't know who Richard Stallman was. However, as it turns out, if I had known, there would have been more of an incentive to choose vim.)

Both editors benefit from plugins. I haven't manually installed many - I think the only ones I currently have installed are the fugitive plugin, written by Tim Pope, and the python indent script written by Eric McSween. I will elaborate on the former in a later post.

Of course, knowing the commands available to you is also something that makes you effective, whichever editor you use. I don't know a tenth of all the stuff that vim can do in theory, but this Stack Exchange question was a lot of help to me.

There are a couple of keybindings I find very helpful. Mapping caps lock to ESC, 't to :tabnew, 's to :w and 'q to :q are some that save plenty of keystrokes in the long run.

The more you use an editor, the better you get at it and the less you gain by switching. So it's likely I will keep using vim for the foreseeable future. And that's ok.

Monday, June 10, 2013

NumPy structured arrays

I'm programming quite a bit in Python, and my understanding of that language is incremental. Due to the nature of my work, I also work a lot with NumPy. Today I had to solve the following problem:

  • Take an input dictionary
  • Create a NumPy structured array with the keys as field names, the datatypes of the values as the field datatypes, and the values themselves as the array elements.
Turns out, even though I have worked with the ndarray dtype a lot, I don't have a very good grasp of structured arrays. This post will not be precise - just a summary of what I have understood today.

First of all, I thought that a structured array would be like a 'normal' NumPy array, just that one of the dimensions had field names and data types associated with them.

But I think I am wrong in this - I think it's more a matter of a structured array being a NumPy array, where each element in the array is a structure (which makes sense once I think about it).

For instance, you can't index a structured array along a second dimension, the way the first interpretation would suggest:

In [1]: dtype = ''.join(('uint8,', 4*'int16,', 'int16'))
In [2]: b = np.array([(0, 1, 2, 3, 4, 5)], dtype=dtype)
In [3]: b.shape
Out[3]: (1,)
In [4]: b[0, 3]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
----> 1 b[0, 3]
IndexError: too many indices


However, if you first pull out a single element (a record) and then index into that, it works:

In [5]: b[0][3]
Out[5]: 3
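
As for the original task - building a structured array from a dictionary - here is a minimal sketch of one way to do it (the dictionary is just a made-up example, and the field order follows whatever order the dictionary yields its keys in):

import numpy as np

d = {'temperature': 21.5, 'count': 7}   # made-up input dictionary

# Field names from the keys, field dtypes from the values, values as the data.
dtype = np.dtype([(key, np.asarray(value).dtype) for key, value in d.items()])
arr = np.zeros(1, dtype=dtype)
for key, value in d.items():
    arr[key] = value

print(arr.dtype)   # e.g. [('temperature', '<f8'), ('count', '<i8')]
print(arr)         # e.g. [(21.5, 7)]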


This is very basic, I know. But it's something I learned today. And that's what this blog mainly is for. Hopefully I will learn more interesting stuff later.

Wednesday, June 5, 2013

Wrapping your body

Sometimes I sleep badly. I have no problem going to sleep, but sometimes I wake up too early for some reason and have trouble getting back to sleep. It's as if my mind is on some kind of high (excited about the coming day, maybe?) and isn't able to calm down until a while later, at which point I have missed two hours of sleep and just know that the day is going to be crap.

As long as you are single and have flexible work-hours, this doesn't need to be that big of a deal. If you wake up early you can go do some work and then go back to sleep when your mind has calmed down a little. However, that's a quite limited group of people.

Something that I've experimented with the last couple of sleepless early mornings has been a relaxation technique that I did a couple of times as a teenager. It's pretty simple - you lie on your back, calm down, and then start thinking about your toes, of relaxing them. You then move upwards through your body and focus on each body part, relaxing it, thinking that it becomes heavier. It feels a little like wrapping your body in some kind of 'Relax-o-wrap'. You end with your mouth, nose and eyes. After that, all of your body is mentally wrapped up, and you actually feel like lifting your arm, for instance, would ruin the wrap.

Then, you start focusing on your breath. You inhale deeply, down to the bottom of your lungs, so that it is your stomach that rises and falls, not your chest. I was taught to inhale through the nose and exhale through the mouth, but it's not vital. The vital part is that you focus on your breathing with your mind. In the beginning it's frustrating and hard to focus, but after a while you suddenly realize that you almost dozed off for a second. Then you actually do doze off for a second. Then for longer. Then you start dreaming - my dreams have been weird during this exercise - often they're about falling and flying etc.

After waking up, I usually do some kind of 'unwrapping' routine, focusing on each body part and making it 'unheavy' again. I don't know how important that is, but it maintains the illusion of a wrap around your body, which I think is important for this exercise.

So far, I haven't been able to go back to 'normal' sleep with this technique, but I find a certain type of sleep which I think is far superior to being awake. Maybe with time, regular sleep comes as well. Progress will be reported (if I can remember to do it).

Thursday, May 30, 2013

Presentations

Yesterday I held a presentation for a community of hobby enthusiasts in my field.

Writing presentations can be quite enlightening, since you ideally should know what you're talking about. Granted, I think it's easier to talk to non-specialists because then you can get away with more handwaving, but sometimes, when you want to explain stuff in the most basic way possible, it requires you to think things through - what's really going on in this process I'm trying to describe? How can it be explained in plain words?

There are some things I loathe when it comes to listening to a presentation:
  • Slides containing walls of text
Like this
  • When the slides are exact copies of the speaker's manuscript
  • When the speaker provides too little background. I would rather hear things one time too many than one time too little.
I therefore try to heed these problematic points when I make presentations, and I work by the following guidelines:
  • I view slides not as the main conveyors of information, but rather as a tool to complement what I say. What I say is the most important thing in a presentation - not what's on the slides.
  • I try to illustrate my points with pictures when possible - it's much easier to grasp and you don't have to read while the presenter is talking
  • When I have to resort to text, I try to keep it as short and simple as possible. The slides provide cues for what I will say, but they're only cues, not manuscripts.
  • I try (time permitting) to provide as much background as the listeners need to understand what I'm talking about.
The last point of course requires you to know your audience. My audience yesterday was quite varied in terms of background - some understand math, some don't. I wanted to explain a little bit about how we analyse data in my field, which required me to explain Fourier decomposition and power spectrum analysis to the audience. I didn't quite succeed, even though I made it as basic as I could. However, the more math-savvy parts of the audience grasped what I said. So it was a choice between dropping my goal of explaining what we do and having a section of the talk be unintelligible. My choice was the latter, even if I didn't know at the time that that was what I was choosing.

As for choosing the style of your slides: sometimes I think certain slide styles can look cool, but elaborate slide designs can sometimes take focus away from what you're saying. I use the LaTeX beamer class with just a very clean setup for my presentations.