-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi source support #101
base: master
Are you sure you want to change the base?
Multi source support #101
Conversation
at a high level, does it break compatibility, and what does the "directory" structure look like? |
No compatibility break, if you provide a single file it will work as before. When you provide a directory to KSM it will traverse recursively and create a source for every file using the correct parser so you can even use csv and yml at the same time. This will work also for the Github source for example, you can provide a directory in a respository and the source will pull all files inside the directory. |
is there one API call at the directory level to see if the timestamp has changed, or do we have to do one API call by file discovered in that directory? I don't want to start overwhelming GitHub for example with API calls. I guess this implementation will also be dependent on the source type, right? |
I have to check better api wise but caching was already in my mind to minimize update time and load on the backend. The commit can be used to check if anything changed at all from last pull and I guess it's possible also to fetch a list of changed files via API. |
Sure have a look - this can be done after the other PR, as it will take longer
—
Stephane Maarek
Book a call with us
https://www.conduktor.io/
…--- original message ---
On May 26, 2021, 7:41 AM GMT+1 ***@***.*** wrote:
I have to check better api wise but caching was already in my mind to minimize update time and load on the backend. The commit can be used to check if anything changed at all from last pull and I guess it's possible also to fetch a list of changed files via API.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
--- end of original message ---
|
src/main/scala/com/github/conduktor/ksm/source/FileSourceAcl.scala
Outdated
Show resolved
Hide resolved
LGTM 👍 |
I realized that the code was completely wrong, I need to call the applySourceAcls with all values not for every parsing context. So what I do now is, first I collect all acls either from the reader or cache and then apply on the result
I stabilized the current code (realized it was actually wrong) and implemented directory support only for the file source, does it make sense merging that one by itself and leave the rest supporting only single file for now? I can add some documentation for it and when I manage implement multi source for the other ones create another PR. p.s. I also need still to add some tests for the multisource support in the File source. |
This is only a first draft to see if the general direction makes sense. Should already work for the file source while it is not implementeds for the others.
After some feedback and once agreed how we do it I will proceed implementing it for the rest of the sources