Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transfer encryption metadata in the commit when DirectUpdateHandler2.closeWriter is called. #113

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bruno-roustant
Copy link
Contributor

@bruno-roustant bruno-roustant commented Oct 18, 2024

The problem is described in the corresponding issue.
It happens when documents are inserted but without commit, and the IndexWriter is closed, which triggers an auto-commit from DirectUpdateHandler2.closeWriter. Unfortunately this auto-commit does not really call the DirectUpdateHandler2.commit code but rather copies it, and we miss a call to DirectUpdateHandler2.shouldCommit to manage the encryption commit metadata.

Actually this requires to change DirectUpdateHandler2.closeWriter in the Solr project.
This PR shows the small required change (see comment in the DirectUpdateHandler2 copy), and verifies the fix with a test.
But I don't think I'll merge this PR. I'll rather change upstream and wait for the next release to upgrade here.

SolrIndexWriter.setCommitData(writer, cmd.getVersion(), cmd.commitData);
writer.commit();
}
//TODO: change for encryption - end
Copy link
Contributor Author

@bruno-roustant bruno-roustant Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the small change required for encryption. I'll do the change upstream.
Actually, from the comment line 959 above, ideally we should call commit(cmd) here instead of copying the code from the commit() method. But apparently many tests failed. So, given this code is a copy of commit(), we should also copy the shouldCommit() call part (which fixes our issue here with encryption).

@bruno-roustant
Copy link
Contributor Author

@dsmiley why did you close this PR?
It seems you wrote some comments, but I don't see them anymore.

@dsmiley
Copy link

dsmiley commented Oct 18, 2024

:embarrassed: totally an accident; sorry! I saw it was closed (didn't notice it was me!), so I deleted my comment to shift it to #109

I suppose IndexWriter.commit() is risky for index encryption... we must always guarantee there is commit metadata? This seems fragile; hard to guarantee, and with bad consequences.

I did find-usages and I see two other callers, and would like your opinion:

  • CoreContainer.reload
  • SolrIndexSplitter.doSplit

But finding all callers and adding protections, again, feels fragile. Like if we do index encryption, maybe we need a stronger lower level guarantee that the metadata is there. I haven't thought about this too much other than observing the fragility.

@bruno-roustant
Copy link
Contributor Author

For the SolrIndexSplitter.doSplit, it is already handled in EncryptionUpdateHandler which extends DirectUpdateHandler2.split(). This method creates the SolrIndexSplitter and gives the Command which contains the metadata.

For CoreContainer.reload, I missed it. Good point.
I don't think there are other callers currently. But I agree this is fragile while Solr evolves. I hope that when this encryption module comes into Solr, we can encapsulate the Lucene IndexWriter in a way that it does the commit metadata transfer systematically at nearly the Lucene level.

@bruno-roustant
Copy link
Contributor Author

I created a PR upstream to fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants