Intellectual Property Rights

DRM is a hot topic in all venues at the moment, and I find myself conflicted. On the one hand, I want to see people get paid for their work. On the other hand, I find the price structures/schedules and distribution of most IP to be utterly irrational.

If I am going to purchase a piece of IP, say a TV show, I feel I should have the right to use it wherever I wish, however I wish. If I want to rip it from DVD and put it on my PSP to carry around with me, that should be not only legal but easy, without having to break the law to do it.

You should not limit my access to the IP I have legally purchased, either. Take Bioshock (PC Version), for example. It is entirely a single-player experience. However, in order to even make it boot the first time, you have to be connected to the internet first. I find this unacceptable. What if, for example, I were to lose my job, and as a cost-cutting measure I had to give up my internet access? What if I wanted to play some of my legally purchased PC games during this time of hardship to help pass the time? (Hypothetical, I know I should be out looking for a job :) Well, too bad for me.

Another example: I bought the PS3 version of Fallout 3. After tooling around with it a bit, I realized my PC was MUCH better suited to playing the game than my PS3 was (controls, graphics, etc. are much better on the PC imo). The problem is, in order to play the game I purchased on my PC instead of my PS3, I would've had to go out and purchase an entirely new copy of the game, doubling my expenditure, since video games cannot be returned to stores. Again, this is totally unacceptable.

The fact of the matter is, you will not stop determined pirates. There are people out there who will steal anything just because they can. What IP producers need to realize if they want to curb piracy is that people want things when they want them, how they want them, and at a price point that doesn't feel like we are buying some corporate exec a new Mercedes. Most people will pay for something they feel is a fair value. I can't even begin to name all the video games in my collection I've bought that I wish I could get my money back for. And I have plenty more that I feel were more than worth the money.

You will not stop a determined pirate. Period. (I reiterate because I want it to sink in.) They will do it because they enjoy doing it. They do it because they can. They do it because you make it a challenge for them to do it, and they love the challenge. And they have yet to be stymied. In fact, the only people who ARE hurt by copy protection schemes are the people who paid out in the first place. You are not hurting the pirates with your draconian DRM schemes. Pirates don't care. They rip the copy protection out of your binaries. It means nothing to them. But for people like me, who reinstall our operating systems on a pretty regular basis, or make incremental changes to our hardware configurations, it is enough to make me want to pirate the games I already own just to get rid of the shitty DRM.

One last example: My brother and I, between us, own every season of The Simpsons available on DVD to date. However, we never watch them. Instead, we have sitting on our media server, in .avi format, those exact same episodes WHICH WE PAID FOR, that we can stream to all the various devices in the house (PCs, PS3, XBox). However, I'm pretty sure ripping all those, even though we own the DVDs, is illegal. Yet, the convenience of having all those episodes in a big list I can choose from is incredible.

Or, to sum it up: http://xkcd.com/488/

Creating layouts in PDF files in PHP - Using Zend_PDF

Recently, I've found myself having to generate dynamic report pages in PDF format using PHP. Working with PDF files has been a pain in the ass at best, so when the time came to wrap PDF reports into the new version of my company's software, I decided to build my own PDF class to encapsulate another library. The basic library choices were Zend_PDF, since we are already using Zend Framework, or FPDF, which is what the current version uses. I ended up choosing Zend_PDF since we got it for free without further includes, but honestly the underlying library doesn't matter much so long as it provides a means of drawing images and text.

So, the main problem faced when generating PDF reports is layout. That is to say, there is no layout management in PDF libraries. What you are given is a blank canvas to draw on. There are no niceties like tables or container classes that you get in HTML. Since I was laying out very tabular data, I really wanted some way to manage the layout in a generic way without having to code for each specific report.

To achieve this effect, I built essentially four classes: PDF, Table, Row, Column. They exist in the following relationships: a PDF consists of tables. Tables consist of rows. Rows consist of columns. Columns have text inside them. By implementing iterators for tables and rows, it is very easy to loop over the structure to place the rows and columns. Tables store an array of rows, rows store an array of columns, and columns store their individual text and options. To make your classes iterable, have them implement IteratorAggregate and provide a getIterator function that returns an ArrayIterator built from your inner array:

class Pdf_Row implements IteratorAggregate
{
    // ...

    /**
     * Allows for iteration over individual columns
     *
     * @return Iterator
     */
    public function getIterator()
    {
        return new ArrayIterator($this->_cols);
    }

    // ...
}

And you can then put your rows in a foreach loop to loop over your columns.

By having each element return $this, you get a nice fluent API. To populate a new table, all you need to do is this:
$table->addRow()
    ->addCol('Column 1')
    ->addCol('Column 2');

To add a second row, simply repeat that process. This allows PDF generation in a style already familiar to all HTML developers, the table. There is a ton of customization that could be done here, basically implementation of all the same options you get with an HTML table, but for my own purposes, all I really needed was the ability to make text wrap, which I will describe a little later. So, I now have a table structure I can loop over and do the layout. But I still just have a blank canvas, so I'm going to have to employ some math.
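For readers who want to see the whole structure in one place, here is a minimal sketch of the table model described above. It is not the original code: only Pdf_Row, addRow, addCol, getMaxCols, getText and isBold appear in this post, so the Pdf_Col and Pdf_Table class names, the $bold argument, and the Countable implementation are assumptions made for illustration. It also assumes PHP 7 or later for the return type declarations.

```php
<?php
// Hypothetical cell class: holds the text and a bold flag.
class Pdf_Col
{
    private $_text;
    private $_bold;

    public function __construct($text, $bold = false)
    {
        $this->_text = (string) $text;
        $this->_bold = (bool) $bold;
    }

    public function getText() { return $this->_text; }
    public function isBold()  { return $this->_bold; }
}

class Pdf_Row implements IteratorAggregate, Countable
{
    private $_cols = array();

    // Returns $this so addCol() calls can be chained.
    public function addCol($text, $bold = false)
    {
        $this->_cols[] = new Pdf_Col($text, $bold);
        return $this;
    }

    public function count(): int
    {
        return count($this->_cols);
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->_cols);
    }
}

// Hypothetical table class: an iterable list of rows.
class Pdf_Table implements IteratorAggregate
{
    private $_rows = array();

    // Returns the new row (not the table) so addCol() can be chained onto it.
    public function addRow()
    {
        $row = new Pdf_Row();
        $this->_rows[] = $row;
        return $row;
    }

    // Highest column count of any row in the table.
    public function getMaxCols()
    {
        return $this->_rows ? max(array_map('count', $this->_rows)) : 0;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->_rows);
    }
}

$table = new Pdf_Table();
$table->addRow()->addCol('Name')->addCol('Ada', true);
$table->addRow()->addCol('Age')->addCol('36')->addCol('Phone');
```

Nested foreach loops over $table and then over each row visit the cells in the order they were added, which is all the layout code needs.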

We need to know a few things to do the layout:
For each table, what is the highest column count for an individual row?
What size paper are we using? (Usually A4, but not always).
We are working under the assumption that we are building reports on paper, so we will want margins around the page. How big do we want our margins?
What is the size of the font we are using?

Once we have these, we can begin to lay out our table. We will use the Zend_PDF library's drawText function to put our text on the page. One thing I found odd when working with Zend_PDF was that the upper left corner is at (0, $pageHeight), as opposed to (0, 0). What this means is, in order to traverse down the page, we have to subtract from the line pointer (tend towards 0), instead of adding to it. Another thing to realize is that Zend_PDF uses points for its units, not pixels, so all units have to be converted into points.
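To make the coordinate system concrete, here is a small sketch of the two conversions involved. The helper names inchesToPoints and yFromTop are made up for this example and are not part of Zend_PDF; the page height of 842 points is the A4 value used later in this post.

```php
<?php
// 1 inch = 72 points.
function inchesToPoints($inches)
{
    return $inches * 72;
}

// Convert a distance measured down from the top edge of the page
// into a y coordinate measured up from the bottom edge.
function yFromTop($pageHeight, $distanceFromTop)
{
    return $pageHeight - $distanceFromTop;
}

$pageHeight = 842;                            // A4 height in points
$margin     = inchesToPoints(1);              // a 1 inch margin
$firstLine  = yFromTop($pageHeight, $margin); // first line sits just below the top margin
$secondLine = $firstLine - 10;                // moving DOWN one 10pt line means subtracting
```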

So let's begin our layout. We will be using A4 paper size, Times_Roman font, size 10.

We start by creating a Zend_Pdf object, which is the base class we will be working with.
// Create the PDF object.
$pdf = new Zend_Pdf();

We want a 1 inch margin around the page, so we convert 1 inch into 72 points.
if (empty($this->_margin)) {
    $this->setMargin(72);
}

Initialize the pages to zero, and create the first page.
// First page
$currentPage = 0;
$pdf->pages[$currentPage] = $pdf->newPage($this->_paperSize);

Now, we need to track the cursor, so we store it in $currentHeight. As mentioned above, the top left corner is not (0,0), but is actually (0, 842). The bottom right corner is at (595, 0).
// Move pointer to the top of the page.
$currentHeight = $this->_maxHeight;

Here I am adding a header image to the very top of the page. Think of it like putting your company's letterhead on. $this->_headerImage contains the path to the image to load.
if(!empty($this->_headerImage)) {
$image = Zend_Pdf_Image::imageWithPath($this->_headerImage);

As mentioned earlier, all units are in points, and we retrieve the image dimensions in pixels, so we need to convert them. Since there are 72 points per inch, and roughly 96 pixels per inch, we get the ratio 72:96 points:pixels, which is 3:4, or 0.75. This gives us the conversion units to convert from pixels to points.
// Convert from pixels to points.
$height = $image->getPixelHeight() * 0.75;
$width = $image->getPixelWidth() * 0.75;

To place the image, I wanted it centered on the X axis, so I had to find two things: the width of the image on the X axis, and the total width of the page on the X axis. Centering is achieved by finding the difference, and adding half of that distance onto the left side of the image as a margin. So if the page is 100x100pts, and the image is 50x50pts, half the difference is 25pts, so the image would be centered at the top from (25,100) to (75,50).

// Parameters go in: Left, Bottom, Right, Top : X1, Y2, X2, Y1
// The offset is how far to shift the image right from 0 to achieve centering on the X axis.
$offset = ($this->_maxWidth - $width) / 2;
$x1 = $offset + 0; $y1 = $this->_maxHeight - $this->_margin;
$x2 = $offset + $width; $y2 = $y1 - $height;

// Draw the header.
$pdf->pages[$currentPage]->drawImage($image, $x1, $y2, $x2, $y1);
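The pixel conversion and the centering math can be checked on their own, without a PDF library. This is only a sketch: pixelsToPoints and centeredBox are hypothetical helper names, the 0.75 factor relies on the same rough 96 pixels per inch assumption as above, and the 400x100 pixel image is an invented example.

```php
<?php
// Roughly 96 pixels per inch and 72 points per inch gives 0.75 points per pixel.
function pixelsToPoints($pixels)
{
    return $pixels * 0.75;
}

// Box for an image centered on the X axis, hanging from the top margin.
// All arguments and results are in points.
function centeredBox($pageWidth, $pageHeight, $margin, $width, $height)
{
    $offset = ($pageWidth - $width) / 2;
    $top    = $pageHeight - $margin;

    return array(
        'left'   => $offset,
        'bottom' => $top - $height,
        'right'  => $offset + $width,
        'top'    => $top,
    );
}

// The 100x100pt page with a 50x50pt image from the text, no margin.
$example = centeredBox(100, 100, 0, 50, 50);

// An A4 page with a 72pt margin and a 400x100 pixel header image.
$header = centeredBox(595, 842, 72, pixelsToPoints(400), pixelsToPoints(100));
```

The keys are in the same left, bottom, right, top order that the drawImage call above expects.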

We want to move the current line pointer down below the image we just drew. We also want a small buffer below it, so we move down by double the line height to give us some padding.
$currentHeight = $y2 - ($this->_fontSize*2);
} else {
// If no header, set the first line below the margin.
$currentHeight = $this->_maxHeight - $this->_margin;
}

Now we come to the layout of the tables themselves. To calculate the width of each individual cell in the table, we take the writable area (page width minus both margins) and divide it by the highest column count of any row. This gives us the width of each column, e.g. 595 - 72 (left margin) - 72 (right margin) = 451 points of writable area. If our largest row has 6 columns, then each column is about 75.17 points wide. Therefore, we begin at the left margin, and simply move 75.17 points to the right for each subsequent column.
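As a quick check of that arithmetic, here is a standalone sketch that computes the cell width and the starting X position of every column. The columnPositions function is a hypothetical helper written for this example, not part of the report class.

```php
<?php
// Returns array(cell width, list of X positions), all in points.
function columnPositions($pageWidth, $margin, $maxCols)
{
    $cellWidth = ($pageWidth - ($margin * 2)) / $maxCols;

    $positions = array();
    for ($i = 0; $i < $maxCols; $i++) {
        // Each column starts one cell width to the right of the previous one.
        $positions[] = $margin + ($i * $cellWidth);
    }

    return array($cellWidth, $positions);
}

// A4 width, 1 inch margins, 6 columns.
list($cellWidth, $xs) = columnPositions(595, 72, 6);
```

The last column ends exactly at the right margin, since six cell widths add up to the 451 point writable area.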

foreach($this->_model as $table) {
// Find the highest column count for all rows.
$maxCols = $table->getMaxCols();

foreach($table as $row) {
$x = $this->_margin;

NumTempLines comes into play when you have to wrap the text in a cell, which I will explain below.

$numTempLines = 0;

foreach($row as $col) {

Set the font we are using to draw the text in this cell.
if($col->isBold() == true) {
$font = $this->_fontBold;
} else {
$font = $this->_font;
}
This offset is the width of our cell as explained above.
// How far to move it on the X axis for the next column.
$offset = ($this->_maxWidth - ($this->_margin*2)) / $maxCols;

Here we pass in the column text to see if it needs to be wrapped.
// Wrap the text if necessary
$text = $this->_wrapText($col->getText(), $offset, $font, $this->_fontSize);

// Set the font to be used.
$pdf->pages[$currentPage]->setFont($font, $this->_fontSize);

Store the current height so we don't lose it.
$tempHeight = $currentHeight;

When we wrap the text, we have to allow room below the current line to accommodate the new lines created in the wrapping process. To do that, we need to insert those rows under us. We store the number of new rows we created in $numTempLines so that, once we are done drawing this row, we can move the line pointer by that many rows down, instead of just one, so that we aren't drawing the next row on top of our wrapped text.

// Keep track of the largest number of lines in any cell of this row.
if(count($text) > $numTempLines) {
$numTempLines = count($text);
}

// Draw the text.
foreach($text as $line) {
$pdf->pages[$currentPage]->drawText($line, $x, $tempHeight);
$tempHeight -= $this->_fontSize;
}
Move the x-axis cursor to the next cell. This is how we keep columns lined up.
// Move the x-axis cursor.
$x += $offset;
}
If we had to wrap columns, we move the line pointer by as many rows as we had to move. If not, we move it one row down.
// Did we have to wrap any columns? If so, move the next row that much.
if($numTempLines > 0) {
$currentHeight -= ($this->_fontSize * $numTempLines);
} else {
$currentHeight -= $this->_fontSize;
}

Because we won't always know how many lines we have to insert due to the fact that we are dealing with dynamic data, we need to make sure we are starting a new page when we hit the bottom page boundary. We want to perform this check every time we move the line pointer, or we risk drawing more text than a page can handle, and we cause problems. We want to start a new page every time our line counter gets down to the bottom margin. We increment the page counter, reset the line pointer, and continue to draw in the PDF.

// Wrap the page if necessary.
if($currentHeight <= $this->_margin) {
$currentPage++;
$pdf->pages[$currentPage] = $pdf->newPage($this->_paperSize);
$currentHeight = $this->_maxHeight - $this->_margin;
}
}
}
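The cursor and page-break bookkeeping can be tried out separately from the drawing calls. The sketch below is a simplified simulation, not the report code itself: paginateRows is a made-up name, and it only records where each row would start. Like the loop above, it checks the boundary after a row is placed, so a heavily wrapped row near the bottom can still run past the margin.

```php
<?php
// $rowLineCounts holds the number of text lines each row needs after wrapping.
// Returns the page index and starting y coordinate for every row.
function paginateRows($pageHeight, $margin, $lineHeight, array $rowLineCounts)
{
    $page = 0;
    $y    = $pageHeight - $margin;   // start just below the top margin
    $placements = array();

    foreach ($rowLineCounts as $lines) {
        $placements[] = array('page' => $page, 'y' => $y);

        // Move down by the tallest cell in the row, at least one line.
        $y -= $lineHeight * max(1, $lines);

        // Start a new page once the cursor reaches the bottom margin.
        if ($y <= $margin) {
            $page++;
            $y = $pageHeight - $margin;
        }
    }

    return $placements;
}

// A tiny 100pt page with 20pt margins and 10pt lines; the third row wraps to 3 lines.
$placements = paginateRows(100, 20, 10, array(1, 1, 3, 1, 1, 1, 1));
```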

Finally, all our tables have been drawn, so we just have to write it out to a file, and we're done.
// Save it.
$pdf->save('test.pdf');
}

So that's how we take a blank canvas and use math to lay a table on top of it. Now, to build a new report, all I have to do is this:
$table->addRow()
    ->addCol('Name')
    ->addCol($someName)
    ->addCol('Date')
    ->addCol($someDate);

$table->addRow()
    ->addCol('Age')
    ->addCol($someAge)
    ->addCol('Phone')
    ->addCol($somePhone);

And I get a 2 row, 4 column table layout, properly spaced and with text wrapping if needed. Further, if I decide down the road to go with a different PDF library, all I have to change is the above build function, swapping the function calls for those of the new library. The interface stays the same. (We would also want to find out where the new library defines the top-left corner of the page, whether (0,0) or (0,height).)

Now, I promised to discuss how wrapping was achieved, so here it is:

/**
* Wraps the given text to the colWidth provided.
*
* @param string text - The text to wrap
* @param int colWidth - The width of a column
* @param object font - The font to use.
* @param int fontSize - The font size in use.
*
* @return array - An array of wrapped text, one line per row.
*/
private function _wrapText($text, $colWidth, $font, $fontSize)
{
$characters = array();

Obviously, if the string is empty, we are done here.
if(strlen($text) == 0) {
return array();
}

What we want here is the ASCII value of each character in the string, pushed into an array. str_split breaks the string into single characters, and ord gives us the value of each one.
// Collect information on each character.
$characters = array_map('ord', str_split($text));

Now, to find out how wide our characters are, we need information about the font being used. This is stored in the Zend_Pdf_Resource_Font object being used as the font, so we can query it. Glyphs are the internal representations of the characters within the object. So we need the glyph numbers for each character. Once we have those, we can find the widths of each one. This returns an array of the widths of every character in the array. Finally, we need to know how to use those widths, since the units are not in points, so we have to do some conversions. The only number they give us is the units per em, so we have to make do with that. A quick function call gives us that number.
// Find out the units being used for the current font.
$glyphs = $font->glyphNumbersForCharacters($characters);
$widths = $font->widthsForGlyphs($glyphs);
$units = $font->getUnitsPerEm();

Armed with those numbers, we can do some math. Yay! A quick array_sum gives us the total width of the string in glyph-units (for lack of a better term), which we can convert into Em (is Ems the plural of that? Or is there even a plural?) by dividing by the units per em. We then convert into points by multiplying the Em width by the point size of the font, and round to the nearest integer. We now have the length of the string in points.

// Calculate the length of the string.
$length = intval(((array_sum($widths) / $units) * $fontSize) + 0.5);

Having the length of the whole thing is nice, but we also want the average length of a single character. This way, l and W don't throw us off too badly, because both are given the same allotment of space.
// Find out the average length of an individual character.
$avg = intval(($length / strlen($text)) + 0.5);

Now we need to decide how many characters can fit on a single line in the cell. We take the total width of the cell and divide by the average character width, rounding down to a whole number (and never going below 1), and thus we know how many characters per line.
// How many characters to wrap at, given the size of the cell.
$numToWrap = max(1, (int) floor($colWidth / max(1, $avg)));

PHP has a built in function wordwrap, which will give you a wrapped string back if you provide an initial string and the number of characters to wrap at. Since that's what we just calculated, we are now armed to wrap the text. We want it in the form of an array though, hence the explode call.
$newText = explode("\n", wordwrap($text, $numToWrap, "\n"));

Finally, we return our newly wrapped text. Congratulations, you've just implemented text wrapping.
return $newText;
}

A note on changing libraries when doing text wrapping: We are using the font object's glyphNumbersForCharacters and widthsForGlyphs functions. If the new library doesn't provide similar information, the best you can do is guess at the average width. It could be tested empirically, if needed.
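Here is a rough sketch of that fallback, for a library that offers no font metrics at all. The wrapByAverageWidth name and the average width of 5 points per character are assumptions for illustration; a real value would have to be measured for the font and size in use. Unlike the function above, this version also passes true as the fourth argument to wordwrap so that a single word longer than the cell gets cut.

```php
<?php
// Wraps $text to fit a cell $colWidth points wide, given a guessed
// average character width in points. Returns one array entry per line.
function wrapByAverageWidth($text, $colWidth, $avgCharWidth)
{
    if (strlen($text) == 0) {
        return array();
    }

    // Whole characters per line, never fewer than one.
    $perLine = max(1, (int) floor($colWidth / $avgCharWidth));

    return explode("\n", wordwrap($text, $perLine, "\n", true));
}

// A 75pt wide cell at a guessed 5pt per character allows 15 characters per line.
$lines = wrapByAverageWidth('The quick brown fox jumps over the lazy dog', 75, 5);
```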

So, I now have a MUCH easier to work with wrapper for the Zend_Pdf library, which can be generically applied to many different situations. There is still a lot I could add to it, add more options to each cell, etc, which I might add as I come up with a need. It's kinda a pain working with a blank canvas at first, until you realize that it actually gives you incredible flexibility.

Doing a project like this definitely makes you appreciate what the people who wrote your HTML renderer went through.

Coders with no idea

I've seen it a lot lately. People who have never seen a well-built application in their lives are the lead programmers at companies. Then, when a new programmer comes in, and tries to fix things, or tries to do new things better than the ways it has been done before, the wall of "This is how the rest of it is done, and we want it to conform" is thrown up. Now, not only have you proven to the new guy coming in that you are incapable of recognizing a well done design, but you also tend to drive him away by forcing him to do something that runs contrary to his nature.

On productivity

There are two things that are killing my productivity right now in my current job (actually, there are thousands, but I will limit this entry to the two that are giving me time to write this entry): waiting for data to move across the network in the form of database backups, and a complete lack of standards for tracking bugs and their fixes.

I spend probably 3-4 hours a day just setting up backup databases from clients in order to reproduce/test/fix bugs they're having. Given how the system is set up, it is almost impossible to test most bugs from a simple test DB setup. The reason for this, if anyone is interested, is that we allow dynamic forms to be built, and they all behave slightly differently based upon the configuration. So, to reproduce most of these bugs, I need the specific data which triggered it. Because of HIPAA standards, though, I can't just reach over and dig through their databases. I have to request a backup from IT, which may or may not be encrypted, and load it locally to play with.

The general setup goes along these lines:
I send an email to IT requesting the backup.
I get a response from IT that it is copying to my shared folder.
The database is copied, so I take it from my share onto my local machine (SQL Server Management Studio won't browse network drives that I've found.)
Once it is local, I can restore it. Unless it is encrypted, then I have to copy it to yet another server where encryption is running.
Once it is up, I have to run multiple scripts on it to set up permissions to give myself access.
Finally, I set a permissions file for PHP to tell it where to access it.

The above takes anywhere from 30 minutes to 4 hours. Per database. It kills my productivity. I can't continue my work until I get the database, so I end up reading around on the internet a lot, which makes me feel like a slacker, which I don't like. But I've been unable to come up with a solution to the problem. I've suggested to IT to give us a server in which we can pull our own databases, where everything can be done locally without the need for copying these things all over the place.

The other speed hit is that I'm running on a gigabit NIC, yet somewhere in the router chain to all of the other machines I need to copy to/from, there is a 10 Mbps switch, which "dumbs down" the entire process. This, too, drives me nuts.

That's one threat to my productivity. The other is how my company is tracking bugs. Right now, they're using two individual systems (they are trying to merge to one, but it's a slow process). They are using SalesForce primarily, and are currently working to prepare an installation of Sugar to use instead. The problem is, updating each entry for bug tracking is currently being done half-assed. So when I tell $boss2 that I need more work to do, she hands me a list of bugs from SalesForce, half of which are invariably already assigned to someone else, or already in QA for testing, or something. Worse, the customer service people are typically the ones opening tickets. They are in the habit of copying and pasting emails directly from clients, which forces me to do their job, which is actually finding out if a bug exists. The ticket should not be crossing my desk until the bug is identified and the solution decided on. Because of all of this, I spend more time trying to figure out whether there actually is a problem than fixing it.

Once I do prepare a fix to a bug, even though we have CVS and SVN servers running, the code doesn't go into those ... it goes into an email to QA, where they do $deity-only-knows what with it. So once it leaves my computer, I never see it again until it's time to roll out a release. Now, I may have my facts wrong about how CVS is supposed to work, but I was under the impression that when you check in, you keep old versions so that each iteration can be "un-done". This would keep us from having to build whole major versions from multiple sets of code fixes each time we want to put out a patch, and make it easy to revert a change if we didn't like it. It would also ensure we have the latest bug fixes each time we pull code to work on. But like I said, I may have my facts wrong.

I'm not sure how to fix these problems, as I don't have a lot of control over the process before it gets into my hands. But they frustrate the living hell out of me, and cause me to waste untold amounts of time.

Coding with the new guy in mind ...

It occurred to me, recently, that the professors who shoved buzz-words like "re-usability," or "maintainability," or even [shudder] "documentation" onto us repeatedly and often in college may have been on to something. I have just up and left my home of 12 years in Lafayette, LA to take a job in Shreveport, LA where I am currently encumbered with someone else's [poorly written] code. Suddenly, things are starting to make sense in a way they couldn't have made sense before now.
I've been coding since I was 17, when I borrowed a C++ book from a co-worker and that was, as they say, that. Not long after, I set about enrolling in college (University of Louisiana at Lafayette), where I entered into an extensive study of all things mathematical. (I always found it ironic that, while I am incredibly good with programming languages and programming in general, I have never been very good at math. Especially since, when you get right down to it, computer science is just fancy math. I digress, however, as that is a topic for another blog entry.) From day one, "maintenance" was the word of the day. However, due to the nature of things at that level, most programming was a solo affair. It is incredibly easy to maintain something you have written yourself (hopefully), where you possess the "broad view," as it were. You know how the system is supposed to function (again, hopefully), and you know what each piece is supposed to do. It is easy to go into code you have written from #include to return 0; and make changes, because you know all the sub-sections of code that are dependent on that piece. Since the vast majority of the work you do for classes is done by you, the lessons imparted by the professors, while necessary, don't quite click like they should until you've spent two months digging around in foreign code.
I took this job with the hope of the naive, thinking it would be a fantastic opportunity to flex my creative muscles and dig into the technology and use it to build things that I could be proud to say, "That's my work, right there." Well, I was partially correct. It has definitely been an education, but the lessons learned were not quite what I thought they were. So far, I've learned about 16 thousand ways NOT to write software. I've come across VERY few things done well, either. So, I thought I would share a couple of the lessons for the future that I've had to learn with anyone who might happen to cross my path here. A couple things that would have made my life here infinitely easier. My work right now is with PHP / MSSQL, but most of the following applies to computer coding in general.

The Art of Mis-Direction
The longer the chain of function calls I have to trace, the more annoyed I get, and the more time I waste. Functions which call other functions, temporary tables that get accessed once, temporary table chains created through 15 different functions in 15 different files all create havoc in your code and are often unnecessary. Keep things simple, and keep related items close together.

Do one thing, do it well
There is an old saying, "Brevity is wit." Not sure where it's from, not sure of its original context, but it applies well to coding. Considering that speed is usually our top concern with computer programs, it is often a best practice to adopt a minimalist approach to coding. Make your code do as little as possible, as clearly and simply as possible, to achieve your goal. Bloat is the #1 killer of software (listening, Micro$oft?), and is the hardest defect to fix. If you adopt a minimalist approach from the start, you save yourself many headaches down the road.

Consistency in naming
Class A has member function insert_db. Class B has member function db_insert.
$varname vs. $varName vs. $Varname vs. $VarName vs. $Var_name vs. $Var_Name vs. $var_Name vs. $var_name ....
Consistency makes everyone's life better, and I don't have to spend so much time double checking the format of the variable being used this particular time.

Class Methods
I should not have to dig through 3000 lines of code trying to take an inventory of the methods that a class exposes. Do me a favor and place a nice, simple, concise dictionary of the functions / variables that are available for me to use in the comments at the top of the file. This way, I can go in and say, "Oh, it does have a SaveToDB() function, and it returns a string which will tell me the success or failure of the save." The faster I can find information, the less annoyed I am, and the faster I can get my work done.

Whitespace means something
Perhaps my #1 pet peeve when digging through someone else's code. White space matters! It matters a lot. White space is not there for the benefit of the computer / compiler / interpreter. It is there for the human. The reader of the code. (Unless you are working in Python, then the interpreter expects the whitespace.) a lack of white space in a computer program is like a lack of punctuation and capitalization in english you can figure out what is being said eventually but it is annoying it makes you stop to think about the form of what you are reading instead of the content and it just plain looks ugly there are reasons we have conventions and rules it makes life easier on everyone (I *was* going to remove ALL the white space from that, but I can't bring myself to be quite that cruel.)

Comment everything
I don't know why you did something, but I can see from the code that you clearly did it on purpose. Was there something down the line that required this time code in this particular format, even though every other time code has a different format? I don't know, because you didn't leave me any clues. Documentation is not for you. Documentation is for the poor bastard who has to clean up your mess.

...

Those are the ones I can think of for this week. When I think of more, I'll add them to the list.

