Doesn't recognize Thunderbird folders recursively #2

Open
calltheninja opened this issue Sep 20, 2016 · 10 comments

@calltheninja

calltheninja commented Sep 20, 2016

I tried pointing it at Local Folders and at various .sbd folders. It connects to Gmail, produces no output, and stops after about 2 seconds. The problem is that Thunderbird doesn't add .mbox to the end of the file names. Once the files are renamed, it works recursively, but it doesn't seem to understand the folder hierarchy beyond one level and dumps a bunch of messages into parent folders.

If you're reading this and trying to use this script on Linux, here's a helpful bash one-liner to rename all the files without extensions to .mbox. If it renames a file that isn't an mbox file, no big deal, the script will just skip it. Obviously, change /path to your directory.

find /path -type f -not -name "*.*" -exec mv "{}" "{}".mbox \;

Thanks for the script, it's really useful for large mbox files, but not so much for complex hierarchies.

@rgladwell
Owner

@calltheninja could you please detail the command line arguments you passed to imap-upload? Do you have other environment details, like operating system, etc.?

@calltheninja
Author

calltheninja commented Sep 21, 2016

@rgladwell thanks for the response, wasn't sure if this was actively maintained or not. To fix the original problem I was using this script for, I ended up going a different route, but I'm happy to provide some info for you to debug with.

Using Ubuntu 14.04 x64, with Thunderbird 45.2.0. I had some email in Thunderbird in Local Folders I was trying to upload to a gmail account. I did this while Thunderbird was closed.

Here's the command line I was using:
python imap_upload.py --gmail --box="UPLOAD" --user=user@gmail.com --password=password --retry=5 "/home/.thunderbird/profilex/Mail/Archives.sdb/"

I had a folder structure with several deep hierarchies like the one attached. These aren't the actual names, just examples. For most things I would gladly provide you with a copy of the folders themselves, but this was my client's data so it must remain confidential.
github.txt

In this example, bank.mbox is stored in the same directory as bank.sbd. Thunderbird just stores "folders" as mbox files unless they have subfolders, in which case it creates a "folder.sbd" at the same level in the file hierarchy, and puts the folders deeper in the hierarchy inside that folder.

imap-upload would make the first folder "bank", upload the bank.mbox emails into it, make another folder "chase", upload the chase.mbox emails into that, but anywhere deeper in the hierarchy like "Statements" would just get put in its parent folder. So statements.mbox would be put in B of A or Chase, I don't remember which.
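The layout described above can be mapped to a full folder path by stripping the `.sbd` suffix from each directory component. Below is a minimal Python sketch of that idea, assuming the files have already been renamed with an `.mbox` extension. `folder_path_for` is a hypothetical helper for illustration, not a function from imap_upload.py.

```python
import os


def folder_path_for(root, mbox_file):
    """Return the folder hierarchy for one mbox file as a list of names.

    Hypothetical helper: every ".sbd" directory between root and the file
    contributes its name (minus the suffix) as a parent folder.
    """
    rel = os.path.relpath(mbox_file, root)
    parts = rel.split(os.sep)
    folders = []
    for part in parts[:-1]:
        # "bank.sbd" holds the subfolders of the folder "bank"
        folders.append(part[:-4] if part.endswith(".sbd") else part)
    leaf = parts[-1]
    # Drop the ".mbox" extension if the file was renamed to have one
    folders.append(leaf[:-5] if leaf.endswith(".mbox") else leaf)
    return folders
```

With the example names from this thread, a file at `bank.sbd/chase.sbd/statements.mbox` under the root would yield `["bank", "chase", "statements"]`. Joining that list with the server's hierarchy delimiter would give the destination folder name; the delimiter depends on the IMAP server.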

Some of the mbox files and folders had spaces or commas in them, I made some test folders + emails to see if that was the problem, but it didn't fix it. You should be able to replicate the problem by:

  • Creating a new thunderbird profile
  • Creating a similar hierarchy and inserting some sample emails
  • Closing thunderbird
  • Renaming all the mbox files in the Thunderbird profile directory to have an .mbox extension
  • Pointing imap-upload at the top of the folder hierarchy
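For a quick test without going through Thunderbird, a similar tree can be built directly on disk. This is only a sketch in Python 3 using the standard `mailbox` module. It assumes the on-disk layout is as described in this thread (extensionless mbox files next to `.sbd` directories), and the folder names, addresses, and `make_sample_tree` itself are made up for illustration. A tree created by Thunderbird may differ in details such as index files.

```python
import mailbox
import os
from email.message import EmailMessage


def make_sample_tree(root):
    """Create bank, bank.sbd/chase and bank.sbd/chase.sbd/statements."""
    layout = [
        ("bank",),
        ("bank.sbd", "chase"),
        ("bank.sbd", "chase.sbd", "statements"),
    ]
    paths = []
    for parts in layout:
        path = os.path.join(root, *parts)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        box = mailbox.mbox(path)  # creates the file if it is missing
        msg = EmailMessage()
        msg["From"] = "sender@example.com"
        msg["To"] = "recipient@example.com"
        msg["Subject"] = "sample for " + parts[-1]
        msg.set_content("test body\n")
        box.add(msg)
        box.flush()
        box.close()
        paths.append(path)
    return paths
```

Each file holds one message, so after an upload it is easy to check which destination folder each message landed in.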

@rgladwell
Owner

To enable the recursive feature you need to pass -r to imap_upload along with the path you want to scan, like so:

python imap_upload.py --gmail --box="UPLOAD" --user=user@gmail.com --password=password --retry=5 -r  "/home/.thunderbird/profilex/Mail/Archives.sdb/"

But there does seem to be a bug: we should treat every file we encounter as an mbox file, unless it fails parsing, rather than relying on the file suffix as we currently do.
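One way to detect mbox files by content instead of by suffix is to check whether the file starts with the `From ` separator line that begins each message in an mbox. The Python sketch below illustrates that heuristic. It is an assumption about how such a check could look, not the fix that went into imap_upload.py, and `looks_like_mbox` and `find_mboxes` are hypothetical names.

```python
import os


def looks_like_mbox(path):
    """Heuristic: a non-empty mbox file begins with a 'From ' line."""
    try:
        with open(path, "rb") as handle:
            return handle.read(5) == b"From "
    except OSError:
        # Unreadable files are treated as "not an mbox"
        return False


def find_mboxes(root):
    """Yield every file under root that passes the content check."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if looks_like_mbox(path):
                yield path
```

Note that an empty mbox file (a folder with no messages) fails this check and is skipped, which is harmless for an upload because there is nothing to send.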

@calltheninja
Author

calltheninja commented Sep 21, 2016

Ah yes, I apologize. I pulled that command from my bash history, from when I was doing some testing to get it working: I had written a script to find all the mbox files and import them by calling imap-upload separately for each file, with hard-coded paths for the IMAP destination. Before that I was using the -r flag, and the bug report accurately describes the behaviour with it. I never finished the script, though.

rgladwell added a commit that referenced this issue Sep 21, 2016
Fixes bug where we are incorrectly detecting MBOX files by trying to
parse the file suffix. Not all MBOX files use this convention, for
example in Thunderbird.

#2
@rgladwell
Owner

When using the recursive feature do you see any error messages from stdout/stderr? Does it just stop parsing?

Also I uploaded a fix for the issue of having to rename Thunderbird mbox files, can you take a look at the thunderbird-support branch and let me know if this works better for you?

@calltheninja
Author

calltheninja commented Sep 21, 2016

Nope, the output looks correct, though I'm not sure whether it showed the correct destination folder. This would be a really useful tool for me if it were able to import Thunderbird profiles. Copying massive amounts of mail with Thunderbird or other tools has proved unreliable for me, and it's a somewhat frequent task in my line of work. I've used imapsync before, but that requires setting up a temporary IMAP server if you want to take mbox files and put them on an IMAP server. Any chance you know of any other tools for this job? I'll run some tests tomorrow with the thunderbird-support branch using some files/folders I can send you copies of.

Can you email me at alex{at}calltheninja.com so I can reply privately with a bug report and zip of those folders?

Thanks so much for your help!

@calltheninja
Author

When running thunderbird-support branch:
python imap_upload.py -h
File "imap_upload.py", line 246
elseif:
^
SyntaxError: invalid syntax

@rgladwell
Owner

@calltheninja I fixed the syntax error, but I'm not sure how much more I can assist: I'm not really set up to provide commercial support for this tool, and I can't really accept confidential, private emails, even for testing purposes.

I'm not aware of any other tools that do the same thing either, 'imap-upload' is probably the easiest and most user friendly one out there.

If you have the time to run some tests yourself, you could pinpoint where the recursive search of subfolders goes wrong, and I might be able to fix that. Alternatively, all PRs are gratefully considered.

@calltheninja
Author

I understand, just trying to help you debug this : ). I can send you the emails (Thunderbird folder) for my test account, they won't be super-confidential. Just send me an email so I can run the tests/send you the results.

@rgladwell
Owner

LOL I thought I was the one helping you?

Can I ask if you tried the updated branch yet?
