By its nature, precomp relies heavily on parsers and file-type recognition code.
This is tedious work, as every parser needs to be written and tuned by hand.
It is also very prone to errors and mismatches, both because each standard has multiple differing implementations and because the streams are often only a fragment of a whole file, leaving the program flying blind.
I believe that robust and accurate universal type detection is not only possible but probably easier to implement than the current system, using the method described here.
The proposed solution correctly assigns file types from 1024-byte file fragments alone, with an accuracy of 98.3%.
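The details of the referenced method aren't reproduced in this issue, but fragment classifiers commonly work from simple statistics of the fragment's bytes. As a rough sketch of the idea (the feature set and the entropy rule of thumb below are my own illustrative assumptions, not the referenced method):

```python
import math
from collections import Counter

def fragment_features(fragment: bytes) -> list:
    """Feature vector for a file fragment: normalized byte histogram
    plus Shannon entropy. A common baseline in fragment-classification
    work; the referenced method may use richer features."""
    n = len(fragment) or 1
    counts = Counter(fragment)
    hist = [counts.get(b, 0) / n for b in range(256)]
    entropy = -sum(p * math.log2(p) for p in hist if p > 0)
    return hist + [entropy]

# Toy illustration: entropy alone already separates text-like data
# from uniformly distributed (compressed/encrypted-looking) data.
text = (b"precomp is a stream precompressor. " * 40)[:1024]
uniform = bytes(range(256)) * 4  # 1024 bytes, flat histogram

print(fragment_features(text)[-1])     # low entropy, text-like
print(fragment_features(uniform)[-1])  # 8.0, maximal for bytes
```

A real detector would feed such vectors from labeled 1024-byte fragments into a trained classifier; the 98.3% figure above presumably comes from a setup of that kind.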
The parsers currently used by precomp are very good in their own right, yet there are a number of future applications where precomp will need more and more detection code, including some open issues: #6, #20, #26, #44 and #86.
There are other applications for quick and correct type detection as well, such as dictionary preprocessors and/or custom compressors for text, exe preprocessing, mm preprocessing, and maybe even fast detection of header-less deflate streams, currently handled by "brute mode".
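For the header-less deflate case, brute-mode detection amounts to trial decompression at candidate offsets. A minimal sketch of that check, assuming Python's zlib (the `min_output` threshold is an illustrative assumption, not precomp's actual logic):

```python
import zlib

def looks_like_raw_deflate(data: bytes, min_output: int = 32) -> bool:
    """Return True if `data` plausibly starts a raw (header-less)
    deflate stream: try to inflate with raw windowing (wbits=-15)
    and require a minimum amount of decoded output."""
    d = zlib.decompressobj(wbits=-15)
    try:
        out = d.decompress(data)
    except zlib.error:
        return False
    return len(out) >= min_output

# A genuine raw deflate stream is accepted...
c = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = c.compress(b"hello precomp " * 100) + c.flush()
print(looks_like_raw_deflate(raw))           # True

# ...while bytes that violate the deflate format are rejected
# (here, a stored block whose NLEN complement check fails).
print(looks_like_raw_deflate(b"\x00" * 64))  # False
```

Trial decompression of this kind can still accept garbage that happens to decode, which is part of why brute mode is expensive; a fast upfront type classifier would narrow down where such trials are worth running.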
There's also the proposed extract switch and the grouping of streams to improve compression.
So it would probably make sense to tackle this before addressing any of the other issues...