-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix FrozenError
s
#168
base: master
Are you sure you want to change the base?
Fix FrozenError
s
#168
Conversation
Calling `.force_encoding` on a frozen String raises a 'FrozenError: can't modify frozen String'. This can be mitigated by calling making a copy before calling `.force_encoding` on that String.
👍 We might as well add |
If you mean |
Hi @bforma and @jethrodaniel , Thank you for your input and for the PR. I love the discussion and agree that using However, I'm super busy until December 25th and I'm not sure when I could test this approach and merge the changes. I'll keep observing for now and hope for beautiful things :-) Kindly, |
@@ -8,7 +8,7 @@ module CombinePDF | |||
def load(file_name = '', options = {}) | |||
raise TypeError, "couldn't parse data, expecting type String" unless file_name.is_a?(String) || file_name.is_a?(Pathname) | |||
return PDF.new if file_name == '' | |||
PDF.new(PDFParser.new(IO.read(file_name, mode: 'rb').force_encoding(Encoding::ASCII_8BIT), options)) | |||
PDF.new(PDFParser.new(IO.read(file_name, mode: 'rb').dup.force_encoding(Encoding::ASCII_8BIT), options)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dup
is redundant and will use up a bunch of memory. The dynamic String returned fromIO.read
(should probably be IO.binread
) is mutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
@@ -76,7 +76,7 @@ def decrypt | |||
|
|||
def set_general_key(password = '') | |||
# 1) make sure the initial key is 32 byte long (if no password, uses padding). | |||
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').force_encoding(Encoding::ASCII_8BIT) | |||
key = (password.bytes[0..32].to_a + @padding_key)[0..31].to_a.pack('C*').dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dup is redundant and will use up a bunch of memory. The dynamic String returned from Array#pack
is mutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
@@ -41,7 +41,7 @@ class PDFParser | |||
# string:: the data to be parsed, as a String object. | |||
def initialize(string, options = {}) | |||
raise TypeError, "couldn't parse data, expecting type String" unless string.is_a? String | |||
@string_to_parse = string.force_encoding(Encoding::ASCII_8BIT) | |||
@string_to_parse = string.dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... why? Are we expecting frozen Strings?
In general I understand that this might be a correct approach that will minimize unexpected side-effects. However, on the other hand, this will use a lot of memory when reading larger PDF files.
# str = "0#{str}" if str.length.odd? | ||
out << unify_string([str].pack('H*').force_encoding(Encoding::ASCII_8BIT)) | ||
out << unify_string([str].pack('H*').dup.force_encoding(Encoding::ASCII_8BIT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(again) This dup is redundant and will use up a bunch of memory. The dynamic String returned from Array#pack
is mutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
@@ -243,22 +243,22 @@ def _parse_ | |||
########################################## | |||
elsif str = @scanner.scan(/\<[0-9a-fA-F]*\>/) | |||
# warn "Found a hex string" | |||
str = str.slice(1..-2).force_encoding(Encoding::ASCII_8BIT) | |||
str = str.slice(1..-2).dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dup is redundant and will use up a bunch of memory. The dynamic String returned from String#slice
should be mutable.
out << "\nendobj\n" if object[:indirect_reference_id] | ||
out.join.force_encoding(Encoding::ASCII_8BIT) | ||
out.join.dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why dup
the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I searched for .force_encoding
and replaced it with .dup.force_encoding
. Perhaps a too aggressive/naive approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps so... but perhaps not a bad idea.
I would consider automatically replacing all 'literal'.force_encoding(Encoding::ASCII_8BIT)
with 'literal'.b
... an operation that could be automated by searching for '.force_encoding(Encoding::ASCII_8BIT)
and replacing it with '.b
end | ||
object.delete :Length | ||
out << '>>'.force_encoding(Encoding::ASCII_8BIT) | ||
out << "\nstream\n#{object[:raw_stream_content]}\nendstream".force_encoding(Encoding::ASCII_8BIT) if object[:raw_stream_content] | ||
out << '>>'.dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095
@@ -123,15 +123,15 @@ def format_hash_to_pdf(object) | |||
# if the object is not a simple object, it is a dictionary | |||
# A dictionary shall be written as a sequence of key-value pairs enclosed in double angle brackets (<<...>>) | |||
# (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)). | |||
out << "<<\n".force_encoding(Encoding::ASCII_8BIT) | |||
out << "<<\n".dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, what about a binary literal? SO: https://stackoverflow.com/a/15843685/4025095
@@ -89,7 +89,7 @@ def set_general_key(password = '') | |||
# # if document metadata is not being encrypted, add 4 bytes with the value 0xFFFFFFFF. | |||
if actual_object(@encryption_dictionary[:R]) >= 4 | |||
if actual_object(@encryption_dictionary)[:EncryptMetadata] == false | |||
key << "\xFF\xFF\xFF\xFF".force_encoding(Encoding::ASCII_8BIT) | |||
key << "\xFF\xFF\xFF\xFF".dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the interpreter requires dup
it's a little odd... Maybe there's a way to indicate that a String literal is ASCII without creating an additional object (see: https://stackoverflow.com/a/15843685/4025095 )?
After all, the object's lifetime os very short (only created to be inserted into the existing String).
@@ -137,7 +137,7 @@ def decrypt_AES(encrypted, encrypted_id, encrypted_generation, _encrypted_filter | |||
object_key = @key.dup | |||
object_key << [encrypted_id].pack('i')[0..2] | |||
object_key << [encrypted_generation].pack('i')[0..1] | |||
object_key << 'sAlT'.force_encoding(Encoding::ASCII_8BIT) | |||
object_key << 'sAlT'.dup.force_encoding(Encoding::ASCII_8BIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the interpreter requires dup
it's a little odd... Maybe there's a way to indicate that a String literal is ASCII without creating an additional object (see: https://stackoverflow.com/a/15843685/4025095 )?
After all, the object's lifetime os very short (only created to be inserted into the existing String).
@bforma , Could you state which Ruby version you're experiencing this issue with? I don't get it on my machine. Maybe you could author a quick Ruby example to reproduce the issue? Thanks! |
We are running with Ruby 2.6.5. All of our Ruby source files have the Some of our tests failed due to a |
I just noticed that our failing tests define a Perhaps this PR is not needed? |
Perhaps it isn't needed in the short term, but it does shine a light on an issue with literals in the source code. The source code could benefit from caching all the literals in binary form and then using them as needed. This could minimize object creation and improve performance for long running applications. |
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
Fix: boazsegev#168 Ref: https://bugs.ruby-lang.org/issues/20205 Ruby 3.4 will emit deprecation warnings when mutating string literals, and a future version is likely to make frozen string literals the default.
.dup
before.force_encoding
Calling
.force_encoding
on a frozen String raises a 'FrozenError:can't modify frozen String'.
This can be mitigated by making a copy before calling
.force_encoding
on that String.