Troy Forster

Musings on technology, environment and adventure

Pages

Recent posts

Tags

Categories

Navigation

Archive

Blogroll

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

Isolating File Extensions Using Regular Expressions In SQL Server 2000

One of our our older applications is still running on SQL Server 2000 -- Yes, I know, it's three revisions and nine years behind but I am told there is a valid reason why it hasn't been upgraded yet.  And no.  I don't know what that reason is -- While the application has it's own dedicated support team I am sometimes called upon to lend assistance when an issue requires a more experienced skill set.  Today was one of those days.

One of the database tables is used to store virtual references to files that end users have uploaded.  Unfortunately there is no validation applied to the filenames and users are free to upload whatever they want.  Although this is fine for files being uploaded it can present problems for subsequent GETs.  If a file has a badly named extension then no appropriate mimetype can be determined.  With no appropriate mimetype a web browser will simply present a download dialog box instead of opening the file with a registered handler or plug-in.  It should be noted that this is not the fault of the end user.  To them it makes perfect sense to name a video file named mymovie.mov as mymovie.mov copy. 

My main task was to parse hundreds of thousands of table rows to get a count of all the unique file extensions in use.  With that information we could determine the scope of the problem and prioritize the work effort required to fix it.  My first thought of course was to simply use the CLR to create a wrapper around a regular expression.  Facepalm! This application is running on SQL Server 2000, ergo no .NET, ergo no CLR, ergo "now what?"

I had a faint recollection of doing something almost a decade ago with SQL Server 2000 and COM Automation.  After some digging through Google as well as my own memory I came across sp_OACreate() and the companion sp_OASetProperty(), sp_OAMethod() and sp_OADestroy().  These stored procedures allow communication with COM and subsequently the VBScript Regular Expression object.  Creating an instance of RegExp in a stored procedure using dynamic SQL is as simple as

     1:
    DECLARE @ObjectId Int
     2:
    DECLARE @ReturnVal Int
     3:
    EXEC @ReturnVal = sp_OACreate 'VBScript.RegExp', @ObjectId OUT

There exists many examples on the web for implementing Regular Expressions this way.  I rolled my own proof of concept so that I made sure I fully understood how the sp_OA* stored procedures work then resorted to Google to locate a finished solution.  See http://www.simple-talk.com/sql/t-sql-programming/tsql-regular-expression-workbench/ for a great implementation of User Defined Functions (UDF) supporting matching, finding and replacing.

Once the UDFs were added to the SQL Server 2000 database it was time to work on the query that would return all the unique extensions.  That was accomplished with a simple cursor that called RegExFind on each iteration.  The RegEx pattern was \.[\w\s\d]*.$ and it was applied to the file path obtained from the table containing the file references.  The query is reproduced below for anyone interested.

The pattern \.[\w\s\d]*.$ means match all characters between the last period in the path (assumed to be the extension delimiter) and the end of the line.  Referring to the example earlier this would match the .mov in mymovie.mov as well as the .mov copy in mymovie.mov copy.

     1:
    DROP
    TABLE
  
     2:     #Extensions
     3:   
     4:
    CREATE
    TABLE
  
     5:     #Extensions (
     6:        Extension VARCHAR(255)
     7:     )
     8:   
     9:
    DECLARE MY_CURSOR CURSORFOR
    10:
    SELECT
  
    11:        filePath
    12:
    FROM
  
    13:        FilesTable
    14:
    WHERE
  
    15:        FilesTable.[Type] = 10
    16:
    AND filePath ISNOTNULL
    17:   
    18:
    OPEN My_Cursor
    19:
    DECLARE @VAR1 VARCHAR(255)
    20:
    DECLARE @Ext  VARCHAR(50)
    21:   
    22:
    FETCH
    NEXT
    FROM
  
    23:        MY_Cursor
    24:
    INTO
  
    25:        @VAR1
    26:
    WHILE(@@FETCH_STATUS <> -1)
    27:
    BEGIN
  
    28:
    IF(@@FETCH_STATUS <> -2)
    29:
    SET @Ext = (
    30:
    SELECT
  
    31:
    TOP 1 value
    32:
    FROM
  
    33:                 dbo.regexfind('\.[\w\s\d]*.$', @Var1, 1, 1)
    34:
    WHERE (
    35:                 LENGTH > 0
    36:              )
    37:           )
    38:
    IF (
    39:
    SELECT
  
    40:                 Extension
    41:
    FROM
  
    42:                 #Extensions
    43:
    WHERE (
    44:                    Extension = @Ext
    45:              )
    46:           ) ISNULLBEGIN
    47:              INSERT INTO
    48:                 #Extensions (
    49:                    Extension
    50:                 )
    51:
    VALUES (
    52:                    @Ext
    53:                 )
    54:
    END
  
    55:
    FETCH
    NEXT
    FROM
  
    56:           MY_CURSOR
    57:
    INTO
  
    58:           @VAR1
    59:
    END
  
    60:
    CLOSE MY_CURSOR
    61:
    DEALLOCATE MY_CURSOR
    62:   
    63:
    SELECT
  
    64:
    DISTINCT Extension
    65:
    FROM
  
    66:     #Extensions
    67:   

Now that the analysis query has been run we are evaluating the results to determine the best remediation solution.  It will probably come down to a replacement of obvious malformed extensions with the correct extension.  I will use a variation of the query above substituting RegExFind with RegExReplace. 

One important note when using this method is that is quite costly in terms of performance.  In our case it is being applied as part of a one time data maintenance and cleanup operation.  I would not recommend using COM Automation like this in a high throughput stored procedure without careful thought first.  At this stage it is probably well worth upgrading to SQL Server 2008 and using the CLR.

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Feb 17 2009, 14:12 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

I'm In Love With jQuery

Me: I'm in love with jQuery!

Ali: What will Prototype say?

Me: We won't tell Prototype.

Ali: She'll get pretty upset after what you did to Dojo!

The first javascript library I used was Dojo 0.4.1 which I thought was awesome at the time.  Then I discovered the more pure syntax of Prototype and script.aculo.us and dumped Dojo.  I was really enjoying Proto-taculous when I decided to check-up on jQuery and was blown away by it's simplicity and power. 

At the moment jQuery seems to be the leading javascript library but it's not without competition.  I have to remember to keep an eye on the other leading libraries because they're all innovating at a fast rate.

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 29 2009, 14:07 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Idea A Day #5 -- The Ultimate Travel Mug And Friend Of The Environment

Fast food coffee shop cups such as Tim Hortens account for a significant amount of annual landfill content.  Even if they were recyclable the cradle to grave energy cost to manufacture, transport, collect, transport and recycle is still high.  We could reduce that cost dramatically if everyone used their own cup as much as possible.  While you do see some people doing this they're often inappropriately associated with a sub-culture such as starving students or weird hippies.  Carrying a personal cup should be socially acceptable and encouraged.  To achieve this social acceptance we need to make using a personal cup easier and provide incentives. 

A prime incentive, that of money, already exists but it's not openly promoted.  I know that Second Cup, Tim Hortens and Starbucks all provide discounts to customers who bring their own cup.  The discount is small, about the equivalent of sales tax.  It may not seem much on a single purchase but it could save $25 a year -- 10c per day * 5 days per week * 50 weeks (coffee drinker was on holiday for 2 weeks).  The discount is probably not promoted because it means a customer is walking around town without displaying the shops logo. 

Enter the Ultimate Travel Mug! 

  • It's double-walled stainless steel for durability and heat retention. 
  • It has an industrial strength stay closed lid with a short plastic strap that can never be separated from the cup and lost.
  • A small groove between the handle and the body conceals a  reusable stir stick because millions of disposable stir sticks also have an environmental impact!
  • The handle incorporates a carabineer style snap closure so the mug can be attached to a backpack, briefcase, etc.
  • An inset is provided so that coffee shops can easily attach their own branding or logo.  Buy the cup from Starbucks and it says Starbucks to everyone you pass in the street.
  • A second smaller inset has an indelible writing surface for personalization thus preventing "Hey, get yer hands off my cup!" syndrome .
  • An integrated RFID allows the user to "charge" their cup with pre-paid coffee credits (similar to Second Cup's Coffee Card) so that subsequent visits don't require digging for change.
  • It has no restrictive patent or license meaning that anyone can manufacture and sell it. 
  • It is unveiled to the world as part of a cooperative campaign shared by all the major fast food coffee shops who pledge to eliminate 99% of their disposable cups within one year.
  • The campaign also pledges to use 100% fair trade, shade grown beans using sustainable practices.
  • It's dishwasher safe.
Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 29 2009, 14:06 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Idea A Day #4 – WKRP In Twitterville - All Request Streaming Radio

I had this idea a few weeks ago and it prompted me to start exploring Google's App Engine as a result (which I like a lot, but that's another post). 

I've been using Last.FM for quite some time now and really like it as a radio station for discovering new music.  I start by adding an artist or tag that I already like then just sit back and listen.  The problem with using the "like an artist" and "like a tag" method is that the radio stream is still going to be somewhat constrained to a narrow and vertical sound.  There could be all sorts of related music that is just beyond the edge described by the meta-data of the current station.  This is were people come in.  People can offer suggestions that computers can't (ooh, I'll regret that statement).

Now I'm thinking of a small app probably with an invisible embedded Flash player that consumes a playlist from Last.FM.  The app has a number of genres, or stations, that the listener can tune into.  Without outside influence the app will simply play the Last.FM stream as the API defines it.  However, the app server is also wired to the Twitter API and listening for tweets sent to a specific user created for this purpose, maybe @DJ although I suspect that's been long taken.  @DJ starts receiving requests for different artists and singles which it looks up in the Last.FM API to ensure broadly matches the genre.  After all, we don't want a Rhianna song following ACDC and Kiss.  The app simply keeps a count of the requests favouring the most requested over a given period of time injecting the most requested into the playlist.

As an exercise in API's, Google App Engine, etc this is a fun project (yes, it's gone a bit beyond the idea stage already).  However, it does have one possible flaw, namely that of Twitter reliance.  After all, if the listener has the app then they also have an interface capable of logging music requests to be sent back to the server.  So why use Twitter?  Well, I'm working on that one.  Perhaps this can be extended into a service that can be resold to terrestrial radio stations?  But that's another idea

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 26 2009, 13:42 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Idea A Day #3 -- The Reflecting Clock

Simply put this is a clock whose hands travel anti-clockwise past reversed numbers. 

Have you ever sat in front of a mirror while getting your hair cut and saw a clock on the opposite wall?  It's backwards because you're looking into the mirror. 

A clock like this would make a great conversation starter.  It would also have great practical use in live theatre and show business in general.  Actors sitting in front of their dressing room mirror can simply look at the Reflecting Clock to see how long till curtain call.

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 22 2009, 14:10 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Idea A Day #2 -- Use A Juicer To Make Vegetable Stock

It all started with a craving for French Onion Soup.  Most recipes call for beef stock and as a vegetarian for the past 20 years this isn't an option.  I've made French Onion Soup with store bought vegetable stock and it works quite well apart from the inhumanely high sodium levels.   I have recently bought some cheesecloth and I plan on simmering all my vegetable cutoffs to make and freeze homemade stock.  Unfortunately I haven't got around to that yet and I wanted the onion soup tonight.  So how could I make a low(er) sodium vegetable stock in a hurry?

Use my juicer!  Anyone that has had fresh vegetable juice knows that it is packed with flavour.  The key, I thought, would be in balancing the flavours to get a rich stock-like essence without tasting like something from a juice bar.  Carrot juice is very sweet so I used one small one.  Celery has a milder flavour so 4 stalks went in.  It was going to be an onion soup with caramelized onions but throwing another in the juicer couldn't hurt.  And for good measure I also juiced the bottom halves of 3 broccoli stems (heads to be steamed for dinner tomorrow).

The resulting juice was flavourful and not too sweet.  I started simmering it with a pinch of salt, pepper and a couple of bay leaves.  Meanwhile I caramelized 4 medium thinly sliced onions with some olive oil, eventually deglazing with a cup of left-over red wine.  Some Soy sauce added a little more salinity and helped darken the hue to French Onion-ish brown.  I transferred the caramelized onions to the simmering stock and continued to cook for about another 20 minutes seasoning a bit more along the way.

The final result was not French Onion soup.  But it wasn't unpleasant either.  In fact it was quite good.  Some of the flavours were definitely interesting but that can corrected by playing with the types and quantities of vegetables.  I'm going to keep working on this until I can nail the right recipe.

Now I just have to think of what to do with the pulp that is left over in the juicer.  Perhaps mixing with tofu and seasonings can be the basis for a homemade veggie burger?  Or is this is another idea to be saved for another day?

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 21 2009, 14:11 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Idea A Day #1 -- iPhone Web Stack Scratchpad

Actually, idea #1 should be this silly idea of posting an idea every day :)  It seems that every waking hour (and some not so waking hours) has my brain in overdrive coming up with all sorts of ideas.  Ideas for a new recipe to make for dinner, a novel piece of software, a fiction novel, a nifty bicycle drive train, inaugurating a hiking trail in downtown Toronto and so, so much more.  In the spirit of mathematics and programming I declare my silly idea to be Idea A Day #0.  So, without further ado, my second published idea, Idea A Day #1.

I love my iPhone.  I use it all the time although surprisingly enough I make very few actual telephone calls on it.  It works exceptionally well as a tiny internet tablet device and the number of decent applications is impressive.  I find that I'm turning to it more and more frequently for utilitarian task.  So much so that I have been wondering if I can use it for prototyping development ideas targeting the standard web stack of HTML, javascript and CSS.  So far I haven't come across any applications that do this but I don't think that means it can't be done.

There are some obvious hurdles that would have to be overcome right away.  The primary one being that of the UI.  The iPhone, for all it's glory, has pretty poor text editing.  You can move the cursor around with difficulty, there is no select, cut, copy and paste.  As well, punctuation characters always require a switch to a different keyboard layout.  Typing all those angle brackets, braces and semi-colons will get tedious very quickly.

My solution for the text entry woes would be to place a sliding panel that is normally hidden, save for a subtle grabber.  A touch of the grabber slides the panel out exposing an accordion menu of individual characters as well as code snippets.  The menu would take advantage of context and predictive patterns when presenting a dynamic list of suggested options.  Another hidden but readily available panel would display 4-way arrows and a couple of modifier button to make it easier to cursor up, down, left, right, start of line, end of line, page up, page down, cut, copy and paste.  Double-tapping a word would select it.  Double-tap and swipe would select adjacent words.  Triple-tap would select a code block.  Tap and hold would popup a menu of characters after a short delay.  A menu that would contain characters such as <>(){}#;'".  Rolling left to right on the menu chooses the appropriate character.  A double-space would place a smart period but unlike the usual iPhone behaviour it would not leave a trailing space.  Instead the cursor would be flush against the period anticipating a dot-notation requirement.

I've also thought about how the actual edit/run would be implemented.  The entire development environment would live inside an html file that already includes the latest jQuery and Firebug Lite javascript libraries as well as a code syntax highlighter supporting, of course, javascript.  The majority of the html markup would implement an editor plugin and the sliding panels at each window edge.  When the run/debug button is tapped (hmm, I think we need another sliding panel, maybe on the top now) the current code in the editor is saved to a cookie.  This limits the editor to a 4K file.  Now I don't want to be accused of 640K ought to be enough for anybody but I know for me right now 4000 characters is probably more than I want to type into the iPhone in its current incarnation.  For reference, my word counter for this article says I'm at 593.  I'd have to type a little less than 8x what is on this page.  Yeah, 4K may not be enough for anybody but it's enough for me.  Sorry, going off on a tangent there -- The code being edited is saved to a cookie and then an iFrame is opened full screen over the current viewport.  The iFrame uses a small boot loader to read the code from the cookie and inject it into the DOM.  The iFrame would already contain references to jQuery and Firebug Lite.  It would also have sliding panel accessible via a tap on a grabber that allows the user to return to the editor.

In principal I don't see anything wrong with this idea (yet).  Practice, unfortunately, sometimes proves principal to be a waste of time.  I doubt that I will take this idea from the drawing board and into Visual Studio but it has been a great distraction for my brain on the walk home from the office tonight.  I'm sure I'll refine the idea within my own grey-matter based VM a bit more.  In fact, typing this out has already got me thinking that the iPhone Web Stack Scratchpad could even be socialized with cloud based storage for the code block, snippet sharing and programmer collaboration.  Oh, and built-in Twitter SDK support that can be programmed against as well as used to tweet out snippets to fellow developers

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jan 20 2009, 14:27 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Safari On The iPhone Suppots Bookmarklets

I didn't even think about it when I first started exploring my new iPhone but in hindsight it makes perfect sense.  The browser is Safari after all.

Unfortunately the only way I have found so far to get bookmarklets on my iPhone is to install Safari on my desktop, delete all pre-installed bookmarks, add my favourite bookmarklets and synchronize.  It's pretty straightforward but it would be much nicer if I could synch directly with FireFox, my default browser.

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Sep 02 2008, 21:03 by tforster | Comments (1) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under:

Using jQuery to GET an ASP.NET webservice

After an hour or two of sleuthing I've discovered that jQuery 1.2.6 does not set the Content-Type header for HTTP GETs even if you explicitly use the contentType parameter. My understanding of the jQuery rationale is that GETs don't contain data and therefore a Content-Type header is not required. Unfortunately, Microsoft reckons that GETs returning JSON provide a security risk and that a Content-Type header must be specified. Scott Guthrie explains it here http://weblogs.asp.net/scottgu/archive/2007/04/04/json-hijacking-and-how-asp-net-ajax-1-0-mitigates-these-attacks.aspx.  

After reading documentation from both camps I can fully understand their opposing views.  Ultimately it is yet another indicator that the responsibility of user protection lies in the browser is blatantly wrong!  If the browser raised a message asking the user if they are ok with a cross domain request and acted according to the response it would render both jQuery's and Microsoft's approaches obsolete.

So now I am caught between a rock and a hard place.  I love jQuery and I love Microsoft ASP.NET webservices and I am going to use both in my current projects.  But, if I can not explicitly set the Content-Type header for an .ajax() GET then I have just one choice and that is to use POSTs instead.  Unfortunately this contradicts the notion of using the correct HTTP verbs and removes the possibility of a RESTful API. 

It's always about compromise.

Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Aug 19 2008, 17:22 by tforster | Comments (1) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under: c

A Bicycle Lock That Tags The Thief

This is awesome.  I have to get one of these locks although it doesn't appear that the lock is in manufacture yet which is a shame.
The lock contains a pressurized dye that is both visible and detectable with forensic equipment.  The designer's home page can be found at http://www.mikelambourn.com/smartlock.html and I've also linked a YouTube video of the lock in action.
 
Digg It!DZone It!StumbleUponTechnoratiRedditDel.icio.usNewsVineFurlBlinkList
Posted: Jul 11 2008, 15:14 by tforster | Comments (0) RSS comment feed |
  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Filed under: