-
A recurring point of misunderstanding is that Kolmogorov Complexity can't be applied to the data itself, only to a model of the data. Since that's the case, one still needs an error term that quantifies the discrepancy between the model and the data. Apart from that, the idea behind Kolmogorov Complexity is that a seemingly simple expression can have hidden complexity, often extending into fractal or chaotic regimes. That's good in a way, in that it's the entire idea behind symbolic regression. Isn't it?
-
Rather than argue against your authoritatively postured confusion, I'll just ask one question, similar to one I use to test large language models: what is the shortest Python program you can come up with that outputs the following string:

Now, to illustrate your confusion about "error terms", let's examine a possible "model":
Note that this is almost a Kolmogorov Complexity approximation of the string, as it outputs:

It fails because it leaves off the trailing '11111'. But it becomes a Kolmogorov Complexity approximation if one modifies the program to instead be:
Note that by adhering to the definition of Kolmogorov Complexity (i.e. lossless compression of the data), the loss is 72 rather than 64 + some ill-defined "error term".
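Since the original string and programs aren't reproduced above, here is a minimal sketch of the same accounting with a made-up target string and made-up programs (the names `run`, `target`, and both program sources are hypothetical): the shorter "model" program is lossy, and appending the missing tail as a literal makes it a lossless description whose total length is the loss.

```python
import io
import contextlib

# Hypothetical target string standing in for the one quoted above.
target = "01" * 30 + "11111"

# "Model" program: short, but lossy -- it drops the trailing '11111'.
lossy_src = 'print("01"*30, end="")'

# Lossless version: the residual tail is appended as a literal, so the
# program reproduces the target exactly.
lossless_src = 'print("01"*30 + "11111", end="")'

def run(src: str) -> str:
    """Execute a tiny program and capture what it prints."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(src)
    return buf.getvalue()

assert run(lossy_src) != target       # lossy: misses the trailing '11111'
assert run(lossless_src) == target    # lossless: exact reproduction

# Under the Kolmogorov-style accounting, the loss is the length of the
# shortest *lossless* program found, not a shorter lossy program plus
# an ill-defined error term.
print("lossy length:", len(lossy_src), "lossless length:", len(lossless_src))
```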
-
Since any finite dataset has finite precision, a given symbolic formula's residual errors will also have finite precision. That being the case, the error residuals can be treated as additive program literals in an expression that approximates the Kolmogorov Complexity of the dataset: an optimal lossless compression of the data.
This would approximate the gold standard for generalization in machine learning: Solomonoff Induction.
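A rough sketch of what such a loss might look like, assuming a crude character-count proxy for the formula's complexity and a simple integer code for the quantized residuals (the function name and both encodings are hypothetical placeholders):

```python
import numpy as np

def lossless_description_length(expr_src, predictions, targets, precision):
    """
    Sketch of a Kolmogorov-style loss: the length of the expression's
    source plus the cost of its residuals, quantized to the data's finite
    precision and treated as additive literals. Together, expression and
    residuals reconstruct the dataset exactly (at that precision).
    """
    residuals = np.round(
        (np.asarray(targets) - np.asarray(predictions)) / precision
    ).astype(int)
    # Cost of the "model" part: here just a character count of its source.
    model_bits = 8 * len(expr_src)
    # Cost of the residual literals: a crude per-value integer code.
    residual_bits = sum(1 + int(np.ceil(np.log2(abs(r) + 1))) for r in residuals)
    return model_bits + residual_bits

# Usage: a candidate formula evaluated on data measured to three decimals.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.round(np.sin(2 * np.pi * x) + 0.001 * rng.standard_normal(50), 3)
loss = lossless_description_length("sin(2*pi*x)", np.sin(2 * np.pi * x), y, 1e-3)
print(loss)
```

The crude proxies above (character counts, a naive residual code) are exactly what the two problems below would have to replace with something principled.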
It seems the only real problems to solve in defining such a loss function are:

1) assigning a principled complexity cost to each primitive function available to the expressions, and
2) handling the finite precision of computed values relative to the precision of the data.
An example of 1) would be assigning a complexity measure to, say, cosine vs arccosine vs arctan vs sqrt in a manner that is justifiable in terms of Kolmogorov Complexity. This, it seems, is a previously solved problem in computer engineering, where machine-language algorithms of minimal length are known -- taking into account that math libraries must incorporate all such algorithms in a manner that minimizes the length of the total library. Pedantry regarding the "arbitrary" choice of machine instruction set may be ignored for the simple reason that there is no such "arbitrary" choice of axiomatic basis for arithmetic in reality. No one serious bothers to even talk to pedants who insist that some random choice of symbols should be considered as principled as, say, Peano's.
An example of 2) would be, say, detecting when a rational number's repeating digit pattern -- which goes on forever -- must be truncated so as not to exceed the precision of the data point being computed. In that case the error residual would not go on forever either, but would be similarly truncated.
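A rough sketch of how 1) and 2) might look in code; the per-operator costs are made-up placeholders, not actual minimal machine-code lengths, and the function names are hypothetical:

```python
# Hypothetical per-operator costs in bits; in the scheme described above these
# would come from the lengths of minimal machine-language implementations,
# amortized over a shared math library (the numbers here are placeholders).
OP_COST = {"add": 4, "mul": 4, "sqrt": 16, "cos": 24, "arccos": 28, "arctan": 28}

def formula_cost(ops):
    """Complexity charge for a formula, as the summed cost of its operators."""
    return sum(OP_COST[op] for op in ops)

def truncate_to_precision(value, precision):
    """Truncate a computed value (e.g. a repeating decimal like 1/3) to the
    data's precision, so the residual it induces is likewise finite."""
    return round(value / precision) * precision

print(formula_cost(["cos", "mul", "add"]))   # e.g. the cost charged to cos(a*x) + b
print(truncate_to_precision(1 / 3, 1e-3))    # 1/3 truncated at three decimal places
```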
Finally, consider that any finite computing system is a finite state machine, which means it is a state space model.