Match Strategy Issue

Feb 25, 2009 at 7:05 PM
I am having issues with the Match Strategy algorithms. No matter which one I pick it does not download all of the pictures from Flickr. Since files are downloaded using the picture's Flickr id can you just match on the id itself instead of introspecting the date taken or title? I would think that this would guarantee that all pictures are downloaded and metadata applied directly. Thank you.
Feb 26, 2009 at 9:36 AM
FlickrMetadataSynchr is not primarily a download tool. Normally the files already reside on a local drive and don't have any Flickr id information. That's why pictures are matched on date taken or title (because most upload tools set the title on Flickr equal to the filename).

Before investigating further if it would be possible to do something with the Flickr id, I would like to understand your issue. Looking at the code, my tool should download all unmatched pictures from Flickr in the selected set if you check the download option. They get the metadata directly from Flickr, so no further matching is necessary.

In short, I do not understand your issue yet.
Feb 26, 2009 at 2:29 PM
First of all, I am using it primarily as a download tool. This tool is critical to me for pulling down my photos from Flickr for backup purposes. I have run into issues with several of my photosets where the number of pictures downloaded is less than the number of pictures stored in Flickr. For example, I have a photoset that has 65 pictures but it downloads only 55. I see in the activity log that it reads metadata from the 65 files but ignores 10 of them because the match keys are the same. I have selected the "Date taken, title/filename" match strategy. After a bit of digging, it looks like the match key only uses the date taken rather than concatenating the date taken and the title/filename. I manually looked at all the pictures and they have unique file names in Flickr.

BTW... A nice feature would be if the downloaded filename were set to the picture's title instead of the Flickr id.

Also, I would like to donate to this tool if we can get these issues resolved. This is a great utility.

Feb 26, 2009 at 3:23 PM
Thanks for the clarification. Now I understand the issue.

As it is, the "title/filename" part in the "date taken, title/filename" strategy only works if the pictures haven't been matched based on date taken. Having multiple images with the same date taken (which can happen if you take more than one picture per second) is indeed a problem for the current version of the tool.
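
To illustrate the collision, here is a minimal sketch in Python (not the tool's own C# code, and not how the tool is actually implemented). It assumes hypothetical picture records and hypothetical key functions. With a key of date taken alone, two pictures shot in the same second fall into one group; a composite key of date taken plus title keeps them apart as long as the titles differ.

```python
from collections import defaultdict

# Hypothetical picture records: (flickr_id, date_taken, title).
pictures = [
    ("1001", "2008-07-04 10:15:30", "IMG_0001"),
    ("1002", "2008-07-04 10:15:30", "IMG_0002"),  # same second as 1001
    ("1003", "2008-07-04 10:15:31", "IMG_0003"),
]


def group_by_key(pics, key_func):
    """Group Flickr ids by match key; a group with more than one id is a collision."""
    groups = defaultdict(list)
    for flickr_id, date_taken, title in pics:
        groups[key_func(date_taken, title)].append(flickr_id)
    return dict(groups)


def date_only_key(date_taken, title):
    # Key on date taken alone: pictures from the same second collide.
    return date_taken


def composite_key(date_taken, title):
    # Key on date taken plus normalized title: collides only if both are equal.
    return (date_taken, title.strip().lower())


by_date = group_by_key(pictures, date_only_key)
by_composite = group_by_key(pictures, composite_key)

print(len(by_date), "distinct keys with date taken only")
print(len(by_composite), "distinct keys with date taken + title")
```

A composite key still collides when both the date taken and the title are identical, so a further discriminator such as the Flickr id would be needed for that case.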

I'll look into the option of using the title/filename or Flickr id as an additional discriminator when pairing up images. And when downloading images, it shouldn't be a problem because the local picture has to be created anyway.

In short, it sounds like a solvable problem. Don't know how long it will take me, because spare time is a limiting factor. Any encouragement will help ;)

With regard to your feature suggestion: setting the filename equal to the picture's title is difficult because filenames are not allowed to contain certain characters like / and : that may occur in titles. Also, filenames have to be unique and titles do not.
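
One possible way around both problems, sketched in Python under stated assumptions (this is not the tool's code; `title_to_filename`, the dash replacement, and the " (2)" suffix scheme are all hypothetical choices): replace the characters Windows rejects in file names and append a counter when a name is already taken.

```python
import re

# Characters Windows does not allow in file names, plus control characters.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def title_to_filename(title, used_names, extension=".jpg"):
    """Turn a picture title into a file name that is valid and not yet used.

    used_names is a set of lower-cased names already handed out; it is
    updated in place. Reserved device names such as CON are not handled.
    """
    # Replace invalid characters, then drop leading/trailing spaces and dots.
    base = INVALID_CHARS.sub("-", title).strip(" .") or "untitled"
    candidate = base + extension
    counter = 2
    # Windows file names are case-insensitive, so compare lower-cased.
    while candidate.lower() in used_names:
        candidate = f"{base} ({counter}){extension}"
        counter += 1
    used_names.add(candidate.lower())
    return candidate


used = set()
first = title_to_filename("Beach 10/2008: sunset", used)
second = title_to_filename("Beach 10/2008: sunset", used)
print(first)
print(second)
```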
Feb 26, 2009 at 5:03 PM
I am happy to hear that it is solvable and I would be willing to help out ;) (Where do I do that?)

I am also running into a few cases where the utility completely crashes when trying to set metadata. The log reads:

FlickrMetadataSynchr.exe Information: 0 : Updated author for local picture with id C:\...\2632085539_6a6ece21c4_o.jpg from '' to 'Gary'.
FlickrMetadataSynchr.exe Error: 0 : Unhandled exception occurred. The application will shut down. Exception details follow:
FlickrMetadataSynchr.exe Error: 0 : System.ArgumentException: Property cannot be found. ---> System.Runtime.InteropServices.COMException (0x88982F40): Exception from HRESULT: 0x88982F40
   --- End of inner exception stack trace ---
   at MS.Internal.HRESULT.Check(Int32 hr)
   at System.Windows.Media.Imaging.BitmapEncoder.SaveFrame(SafeMILHandle frameEncodeHandle, SafeMILHandle encoderOptions, BitmapFrame frame)
   at System.Windows.Media.Imaging.BitmapEncoder.Save(Stream stream)
   at Yorrick.FlickrMetadataSynchr.Local.LocalPicturesHelper.WriteCopyOfPictureUsingWic(Stream originalFile, LocalPictureMetadata metadata, String outputFileName) in C:\Sources\FlickrMetadataSynchr\Main\FlickrMetadataSynchr\Local\LocalPicturesHelper.cs:line 658
   at Yorrick.FlickrMetadataSynchr.Local.LocalPicturesHelper.UpdatePictureMetadataUsingWicWithCopy(LocalPictureMetadata metadata) in C:\Sources\FlickrMetadataSynchr\Main\FlickrMetadataSynchr\Local\LocalPicturesHelper.cs:line 553
   at Yorrick.FlickrMetadataSynchr.Local.LocalPicturesHelper.UpdatePictureMetadataUsingWic(LocalPictureMetadata metadata) in C:\Sources\FlickrMetadataSynchr\Main\FlickrMetadataSynchr\Local\LocalPicturesHelper.cs:line 467
   at Yorrick.FlickrMetadataSynchr.Local.LocalPicturesHelper.<UpdatePictureMetadata>b__0(Object o) in C:\Sources\FlickrMetadataSynchr\Main\FlickrMetadataSynchr\Local\LocalPicturesHelper.cs:line 446
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Boolean isSingleParameter)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Boolean isSingleParameter, Delegate catchHandler)
   at System.Windows.Threading.DispatcherOperation.InvokeImpl()
   at System.Windows.Threading.DispatcherOperation.InvokeInSecurityContext(Object state)
   at System.Threading.ExecutionContext.runTryCode(Object userData)
   at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Windows.Threading.DispatcherOperation.Invoke()
   at System.Windows.Threading.Dispatcher.ProcessQueue()
   at System.Windows.Threading.Dispatcher.WndProcHook(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Boolean isSingleParameter)
   at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Boolean isSingleParameter, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.InvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Boolean isSingleParameter)
   at System.Windows.Threading.Dispatcher.Invoke(DispatcherPriority priority, Delegate method, Object arg)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
   at System.Windows.Threading.Dispatcher.PushFrame(DispatcherFrame frame)
   at System.Windows.Threading.Dispatcher.Run()
   at System.Windows.Application.RunInternal(Window window)
   at System.Windows.Application.Run(Window window)
   at System.Windows.Application.Run()
   at Yorrick.FlickrMetadataSynchr.App.Main() in C:\Sources\FlickrMetadataSynchr\Main\FlickrMetadataSynchr\obj\Debug\App.g.cs:line 0

Any ideas? Anything I can send you to help you debug?
Feb 27, 2009 at 5:07 PM
I could send you a link to my Amazon wish list as a way to encourage me to quickly fix the issues you encountered ;)

The new issue in the stack trace indicates that the Windows Imaging Component isn't able to update the metadata of the picture. Could be a bug in WIC (which is not under my control), or some sort of corruption in the metadata in those files. 

Do you have .NET Framework 3.5 SP1 installed? If not, installing it might update WIC to a newer version.

If you feel really lucky, you might try my app on Windows 7 Beta. According to this blog post many WIC bugs have been fixed in Windows 7:

The only thing I can offer in my application is to "handle" this error by making my app more resilient, i.e., just ignoring the error, noting it in the log and continuing with the next picture. The file in question will most probably not be updated.
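
The pattern described above looks roughly like this. It is a Python sketch, not the tool's C# code; `update_all` and `fake_update` are hypothetical names, and `fake_update` merely simulates a failing metadata write.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sync-sketch")


def update_all(pictures, update_metadata):
    """Apply update_metadata to each picture; log failures and keep going."""
    failed = []
    for picture in pictures:
        try:
            update_metadata(picture)
        except Exception as exc:  # deliberately broad: one bad file must not stop the run
            log.error("Could not update metadata for %s: %s", picture, exc)
            failed.append(picture)
    return failed


def fake_update(picture):
    # Hypothetical stand-in for the real metadata write; fails for one file.
    if picture == "b.jpg":
        raise ValueError("Property cannot be found.")


failed = update_all(["a.jpg", "b.jpg", "c.jpg"], fake_update)
print("Skipped:", failed)
```

Returning the list of failed files lets the caller report them at the end of the run instead of stopping at the first error.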

Feb 27, 2009 at 5:16 PM
Please send me that Amazon link ;)

I have not upgraded to .NET 3.5 SP1 yet but I will try it to see if that fixes the problem. In the meantime, if you could handle that error and move on to the next picture, that would be handy.

So as a summary the features requested are:

1. Enhance match strategy algorithm to ensure the downloading of all pictures
2. Handle WIC bug gracefully
3. Save pictures with the title from Flickr (turn all "bad" windows characters to dashes or something)
4. Show the number of photos in each set next to each album name in the drop down (nice to have)
5. Allow the syncing of all photosets (nice to have)

The first 3 would be huge. I will "encourage" you to start and if things look good I will "encourage" you some more ;) ok?
Feb 27, 2009 at 6:22 PM
Great! Deal.

This is the link to my Amazon wish list:
Feb 27, 2009 at 9:53 PM
I see you are going to Barcelona ;) My wife and I visited there last year. We had a great time. Make sure you get out of the city and visit Montserrat. There is a beautiful museum in addition to the church which is amazing.
Mar 2, 2009 at 7:05 PM
Thanks for the Montserrat tip.

Consider items 1-4 work in progress ;) Can you send me the URL of a picture that causes the WIC metadata update exception that crashes the app?
Mar 4, 2009 at 6:49 PM
Thank you and I appreciate the help :)

The offending picture looks to be:

Mar 4, 2009 at 7:49 PM
This is indeed an offending picture, I can reproduce the metadata update error that you get. The app now handles this as a "normal" error and doesn't crash. So #2 on your list is fixed.

I've also managed to add features #3 and #4 to my app. I think #3 will also serve as a work-around for issue #1 if you choose "title/filename" as match strategy. I'll look into fixing #1 in a better way in the near future.

In the meantime, you can try a first preview of version by downloading it from my Windows Live SkyDrive (this link will work until a newer version is released).
Mar 4, 2009 at 8:02 PM
Thank you for the update. Any idea what is "wrong" with that picture?

I downloaded and tested the new app and #1 is still an issue because I have pictures with the same name. There is no way to concatenate the date taken with the title to get a more unique key?
Mar 4, 2009 at 8:11 PM
Unfortunately I have no idea what is "wrong" with that picture.

> There is no way to concatenate the date taken with the title to get a more unique key?

There might be, but it will take considerably more time to develop. 

Your public pictures seem to have the filename as title. And you stated earlier "I manually looked at all the pictures and they have unique file names in Flickr". So I supposed it would be possible to match them on a title/filename basis. If you choose the download option, they will now get the title as filename.
Mar 5, 2009 at 12:56 PM
I didn't realize it, but on some of our albums my wife did tag many of the pictures with the same name :( Any way you could work on that advanced matching logic? It would be much appreciated.
Mar 5, 2009 at 11:47 PM
Edited Mar 5, 2009 at 11:49 PM
I've now significantly enhanced the matching algorithm to deal with pictures with the same datetime taken. Check out the explanation on work item 9393.

A second preview of version is now available for download on my Windows Live SkyDrive (link will work until a newer version is released). Please give it a try.

DISCLAIMER: You might want to try running "simulation mode" first and carefully check the activity log for what is happening!
Mar 6, 2009 at 12:54 PM
Overall looks awesome but I have noticed a few things:

1. Using "Smart match", the first time I do a download I get all the pictures (great!)
2. I run it a second time and it adds the "flickr:id=...." tag to every picture and updates the description (Updated description for Flickr picture with id 1988075260 from '' to '                                    '.)
3. I run it a third time and it still tries to update the description

So why does it add the flickr:id in the second pass? Is it possible to add the flickr:id as a custom meta tag so as to not pollute the tag list? Lastly, I think you may need to do a trim on the description when you compare the file on disk to the online version to avoid the description being updated every time for no reason.
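
The trim suggested above could look like this. It is a Python sketch under the assumption that a missing description and an empty one should count as equal; `descriptions_differ` is a hypothetical helper, not a function from the tool.

```python
def descriptions_differ(local, remote):
    """Compare two descriptions, ignoring leading and trailing whitespace.

    None is treated as an empty description.
    """
    return (local or "").strip() != (remote or "").strip()


# A blank local description and a Flickr description of only spaces
# should not trigger an update.
print(descriptions_differ("", "                    "))
print(descriptions_differ("Sagrada Familia", "Sagrada Familia  "))
print(descriptions_differ("Sagrada Familia", "Park Guell"))
```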

Hopefully this makes sense and thank you for all your help with this.
Mar 6, 2009 at 1:09 PM
Good point on the "trim". It looks like your local picture somehow has a lot of spaces in the description. I have never noticed this problem myself. I think it should be a one-time issue though. After a real sync (not after simulation of course ;) both sides have the spaces in the description so no further sync should be needed. Do you have a URL for this picture so I can reproduce the issue?

The addition of the 'flickr:id=<value>' tag is optional. There is a setting in FlickrMetadataSynchr.exe.config file to enable/disable this. Open it in Notepad (or even better, an XML editor) and look up 

   <setting name="FlickrIdTagBehavior" serializeAs="String">

Supported values are AddOrUpdate, Remove or DoNothing. I might surface this setting in the UI in future versions. If you don't like this behavior, you can set it to Remove and sync your pictures again. The tag will be removed for both the Flickr picture and the local picture.

I felt this addition was the most robust way to ensure that images can be successfully matched in the face of situations with duplicate datetime taken and titles. I don't think there is a way to hide tags for local pictures. On the Flickr side it is a so-called machine tag. Flickr treats these as special and allows you to hide them.

It is not absolutely necessary to have this tag on the Flickr side, but the synchronisation of tags is easier if it exists on both sides.
Mar 6, 2009 at 1:13 PM
As further clarification: during download no metadata is changed on the Flickr side as there are no local pictures when the sync is started. The local picture does get the flickr:id=... tag though if FlickrIdTagBehavior=AddOrUpdate. During a second sync pass, when syncing tags, the app notices the Flickr side doesn't have the flickr:id=... tag yet and adds it.
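
A rough Python sketch of how the three FlickrIdTagBehavior values could act on a tag list. The value names come from this thread; the function `apply_flickr_id_tag` and its logic are assumptions for illustration and do not show the tool's actual implementation.

```python
PREFIX = "flickr:id="


def apply_flickr_id_tag(tags, flickr_id, behavior):
    """Return a new tag list after applying the given behavior.

    behavior is one of "AddOrUpdate", "Remove" or "DoNothing".
    """
    if behavior == "DoNothing":
        return list(tags)
    # Drop any existing flickr:id=... tag so it is never duplicated.
    others = [tag for tag in tags if not tag.startswith(PREFIX)]
    if behavior == "Remove":
        return others
    if behavior == "AddOrUpdate":
        return others + [PREFIX + flickr_id]
    raise ValueError(f"Unsupported behavior: {behavior}")


# After the download pass the local side has the tag and the Flickr side
# does not, so a second pass with AddOrUpdate changes the Flickr tag list.
local_tags = apply_flickr_id_tag(["barcelona"], "1988075260", "AddOrUpdate")
flickr_tags = ["barcelona"]
second_pass = apply_flickr_id_tag(flickr_tags, "1988075260", "AddOrUpdate")
print(local_tags)
print(second_pass != flickr_tags)
```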
Mar 6, 2009 at 2:38 PM
So does Flickr store the flickr:id by default in the tags? If so, isn't it risky to remove those tags?

So the description bug does happen on every sync because I believe Windows is trimming the description but the Flickr image has all the spaces, so it is a mismatch every time. Here is a sample image (from Barcelona :) ) :

Mar 6, 2009 at 3:30 PM
The flickr:id=... tag is a custom tag that isn't used by Flickr itself. There is no risk in removing it.

I can repro the description = '              ' sync issue with your picture, so it will be fixed.
Mar 6, 2009 at 7:51 PM
The description='         ' sync issue is fixed. In fact it was reported in 2007 already, but I overlooked it! Could you give Preview 3 a try? (link valid until a new version is released)
Mar 7, 2009 at 4:59 PM
Preview 3 is looking really good! I have been able to pull down all of my pictures using the "Smart match" and "DoNothing" for the Flickr id param. The description bug also looks to be solved. The only thing I noticed was for the images that fail to update their metadata an ".output" file is left in the directory. Not a big issue.

On a side note, do you know why Google's Picasa does not read the image tags?

Mar 7, 2009 at 6:18 PM
I've rethought the flickr:id=<id> tag idea, because indeed it clutters up the tag space. So I dropped it and replaced it with something better. I now use the ImageUniqueID metadata that can be added to both EXIF and XMP metadata to store the Flickr id in the local picture. It is invisible to most applications.

Please give Preview 4 a try (link valid until a new version is released).

Since preview 1 no .output files should be left if the metadata cannot be updated, because if the app doesn't crash such a file is deleted when an exception occurs. If it does somehow leave a file (which would be a bug) it would have the ".out" extension instead of ".output". Maybe it was left over when the version crashed?

Don't know about Picasa. I am a Windows Live Photo Gallery user.
Mar 8, 2009 at 12:13 AM
Much cleaner solution using the ImageUniqueId. Keeping the tag space clean is critical to organization.

I will do more research on the .output issue and get back to you with ways to recreate it. Thank you again for your attention to detail.
Mar 10, 2009 at 10:48 PM
Thank you very much for the incentives Gary. I've received them today. Should come in handy when I visit Barcelona around Easter.
Mar 20, 2009 at 12:48 PM
Hi Erwyn-

I was working with the tool a bit more last night and I noticed that it crashes when dealing with GIFs when trying to update the metadata. I assume your tool only works with JPGs. Also, for videos it pulls down a snapshot of the video but not the video itself. 
Mar 20, 2009 at 4:39 PM
Hi Gary

Can you open two separate issues through the issue tracker for this? This "match strategy issue" thread is already way too long ;) 

My intent is to let the tool only handle JPEG files because updating metadata in GIFs and video files is impossible or a downright nightmare. I haven't tested the download option against video and GIF files, so I can imagine that it doesn't properly skip them and crashes.

Have a nice weekend.