Some fundamental concerns #73
Replies: 2 comments 4 replies
-
Your concerns are all valid, and we have looked into each of them. Let me break it down:
So although many caveats apply, the approach is at least based on some theoretically valid assumptions, and we have seen it work with appropriately configured machines.

Now to answer your question in full: we simply do not know whether the values GitHub claims for its machines are correct. They can say the machine has a fixed frequency, or that it is an AMD Epyc processor, when in reality it is not. The hypervisor values can be faked. Eco-CI can therefore only give you a sound estimation, but nothing more. On top of that, Eco-CI is not intended to provide "estimations for optimization", but really only for accounting, and for that the approach is quite feasible. We looked into the optimization idea in this case study: https://www.green-coding.io/case-studies/ci-pipeline-energy-variability/ and saw that, due to network latency and resource over-subscription, even super-simple pipelines tend to fluctuate by 30% or more. So no optimization is possible in GitHub Actions from the start, no matter how accurate Eco-CI is :)

Regarding the overhead: yes, this is a fair point. It is too high. Period. The way forward is to implement even stronger caching through a container and also to pre-train the model. This would probably reduce the overhead by 90% or more. See issues and approaches here: We are very happy to accept PRs on this, as Eco-CI is currently one of our free open source works for the community.

Also a nudge: since I saw you work for Siemens, maybe you can even convince your employer to support an open source project financially. We would be very grateful and super eager to devote more time to making Eco-CI better, and we hope we have already given some good head-start work. With more time for us to work on it, there is more good work to come :)

In any case: thanks for taking the time to write the discussion. I believe this is very helpful for us and also for anyone else who reads it. ty!
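To make the fluctuation argument concrete, here is a toy illustration (all numbers invented for demonstration, not taken from the case study):

```python
# Toy illustration (numbers invented, not from the case study):
# with ~30 % run-to-run jitter, single-run comparisons cannot
# reveal a modest optimization.
import statistics

# Five runs of the *same* pipeline, fluctuating around ~100 J:
runs = [82, 104, 127, 95, 118]

mean = statistics.mean(runs)
spread = (max(runs) - min(runs)) / mean  # relative min-max spread
print(f"mean={mean:.1f} J, spread={spread:.0%}")  # → mean=105.2 J, spread=43%

# A 10 % improvement (~10 J) is far smaller than the observed spread,
# so one measurement before and after an "optimization" proves nothing.
```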
-
I already pinged you in a different thread, but wanted to answer here as well for completeness. We have overhauled the plugin, and PR #76 has removed a lot of dependencies: no packages are installed anymore, and no Docker container is downloaded. Happy if you give it a spin, and looking forward to your opinion now. It is also unclear to me why GitHub still takes 6 s for the measurement step, as detailed in the PR. Maybe you have some pointers / ideas.
-
Hi all!
While looking into some high-level slides about green computing and the Green Software Foundation, I thought: "Why not make this concrete? Why not have a CI extension that tells me roughly what I emitted?" Then I found this project and was pleased that it already exists. But now, after trying it out and thinking about it further, I'm no longer so sure. Here are my concerns:
How realistic are the numbers produced by this measurement?
As far as I understood, the model basically just looks at CPU usage. It does not consider the base consumption of the runner or peaks caused by peripherals such as storage or network interfaces (not to speak of GPU accelerators). Did you compare the numbers generated by the model on a local server against a power meter attached to that machine? Do the numbers scale realistically? Can we ignore the other factors without getting even relative results wrong? I lack confidence that the visualized numbers do not suggest wrong optimizations. And that leads to my second concern.
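For reference, the kind of CPU-only estimate I am questioning can be sketched like this (the wattage figures and the linear interpolation are my own invention for illustration, not Eco-CI's actual model):

```python
# Minimal sketch of a CPU-utilization-only power model (NOT Eco-CI's
# actual model; the idle/full-load wattages and the linear interpolation
# are invented for illustration). Storage, network and GPU are ignored.
IDLE_W = 50.0   # assumed idle power draw of the runner, in watts
FULL_W = 180.0  # assumed full-load power draw, in watts

def power_estimate(cpu_utilization: float) -> float:
    """Estimated power draw in watts for a utilization in [0, 1]."""
    return IDLE_W + (FULL_W - IDLE_W) * cpu_utilization

def energy_joules(samples: list[float], interval_s: float = 1.0) -> float:
    """Integrate estimated power over utilization samples, one per interval."""
    return sum(power_estimate(u) * interval_s for u in samples)

# 60 s at a constant 50 % utilization:
print(energy_joules([0.5] * 60))  # → 6900.0 (joules)
```

Anything that draws power without raising CPU utilization (disk I/O waits, network transfers, an idle GPU) is invisible to such a model, which is exactly why I am asking about validation against a physical power meter.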
How to avoid wrongly optimizing the runner at the price of causing emissions, possibly high ones, elsewhere?
Do you have numbers on what caching services consume? Looking at #14, it does not seem that clear. It is fair to ignore external factors while optimizing, provided the ratio between their estimated impact and the part under our own control is negligible (say, 1:10).
Last but not least: I would strongly recommend not configuring these CI steps unconditionally. Activate them only every n-th run or on explicit request, to get an update after noteworthy pipeline changes. Even the cached execution is still too heavy IMHO, given all the dependencies it installs. And the execution time can apparently also go up (https://github.com/siemens/kas/actions/runs/9336544184/job/25697237499 vs. https://github.com/siemens/kas/actions/runs/9331141542/job/25685599701). Is that something your https://github.com/green-kernel is supposed to address?
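The every-n-th-run gating could look roughly like this (a sketch: the period of 10 and the helper name are my invention; `GITHUB_RUN_NUMBER` itself is an environment variable provided by GitHub Actions):

```python
# Sketch: run the measurement step only on every n-th pipeline execution.
# GITHUB_RUN_NUMBER is set by the GitHub Actions runner; the period of 10
# and the should_measure() helper are invented for illustration.
import os

def should_measure(run_number: int, period: int = 10) -> bool:
    """True on every `period`-th run (run numbers start at 1)."""
    return run_number % period == 0

if __name__ == "__main__":
    run_number = int(os.environ.get("GITHUB_RUN_NUMBER", "1"))
    # Print the decision as a key=value pair so a workflow step could
    # append it to $GITHUB_OUTPUT and gate later steps with an `if:`.
    print(f"run_measurement={str(should_measure(run_number)).lower()}")
```

The printed key/value pair could then be appended to `$GITHUB_OUTPUT` and checked in an `if:` condition on the measurement step, so the other nine out of ten runs skip it entirely.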
I know we have to start somewhere, but given that overall emissions are what counts, I'm a bit reluctant to perform and promote local measurements too early. Until then, I would rather stick with existing, "free" KPIs like reducing CI time or avoiding job executions altogether.