Use numpy.concatenate instead of hstack, vstack #1009

chillenzer · 2022-09-15T10:48:43Z

Hi everybody,
I was just reading through Episode 2 and was surprised by the appearance of numpy.hstack and numpy.vstack. Isn't it more useful to just introduce numpy.concatenate with an appropriate axis kwarg? I personally never use the {h,v}stack functions because they lack the generality to handle some cases for higher dimensions (and whenever I did, it took me a while to sort out which of the 3, 4 or 5 axes of my array is considered "horizontal"). Even if the tutorial (at least at that point) is only concerned with 2D data, would it hurt to give learners the exact same functionality while sneaking in the generality they might need for their own use case? One could even argue that it is simpler

- to have only one function name to remember.
- not to rely on learners having the correct geometrical picture in mind when they could have an unambiguously enumerated axis instead.
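
To make the comparison concrete, here is a minimal sketch of the equivalence for 2D arrays. The arrays `a`, `b` and `c` are made up for illustration and are not taken from the episode.

```python
import numpy as np

# Two small, hypothetical 2D arrays with the same shape (2, 3).
a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

# For 2D input, axis=0 gives the same result as numpy.vstack ...
stacked_rows = np.concatenate([a, b], axis=0)
# ... and axis=1 gives the same result as numpy.hstack.
stacked_cols = np.concatenate([a, b], axis=1)

print(stacked_rows.shape)  # (4, 3)
print(stacked_cols.shape)  # (2, 6)

# With more dimensions, the axis is simply named by its number.
# Neither hstack nor vstack joins along axis=2 of a 3D array.
c = np.zeros((2, 3, 4))
joined = np.concatenate([c, c], axis=2)
print(joined.shape)  # (2, 3, 8)
```

In each case only the length of the chosen axis changes, which may be easier to follow than deciding which direction counts as "horizontal".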

Admittedly, this might be just personal preference (my own as well as of the people I work with), so I would be interested to hear if there are some rational arguments for the current way of doing it. If not, I'm happy to provide this small patch myself.
Best,
Julian


shermanlo77 · 2022-12-19T12:07:18Z

It's an interesting point! I think numpy.hstack() and numpy.vstack() would help those who haven't grasped the idea of dimensions and axes yet. For example, putting one Lego block on top of another is easier to think about than working out which dimension is which.

But a note on numpy.concatenate() would be useful for the stronger students.

chillenzer · 2022-12-19T12:35:41Z

Okay, I see your point. When teaching this material, we had some discussions with the learners about how the data is laid out and what the axis argument actually means. This is particularly tricky for 2D arrays, where there is a coincidental symmetry between an axis and its complement. For example, when np.sum(..., axis=0) reduces a 5D array to a 4D array, it is pretty clear that the sum was taken along axis 0. When a 2D array is reduced to 1D, the same call could be read either as "summed along axis 0" or as "only axis 0 is kept", and the shapes in such situations are often symmetrical, so they do not settle the question.
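
A small sketch of that ambiguity, using made-up arrays filled with ones (the names `big`, `square` and `rect` are only for illustration):

```python
import numpy as np

# 5D case: summing over axis 0 removes exactly that axis,
# and the remaining shape makes it obvious which one is gone.
big = np.ones((2, 3, 4, 5, 6))
big_shape = np.sum(big, axis=0).shape
print(big_shape)  # (3, 4, 5, 6)

# Square 2D case: both axes have the same length, so the result
# shape cannot tell us which axis was summed over.
square = np.ones((3, 3))
square_shape_0 = np.sum(square, axis=0).shape
square_shape_1 = np.sum(square, axis=1).shape
print(square_shape_0, square_shape_1)  # (3,) (3,)

# A non-square 2D array removes the ambiguity: axis=0 collapses
# the 2 rows and leaves one value per column, and vice versa.
rect = np.ones((2, 3))
rect_shape_0 = np.sum(rect, axis=0).shape
rect_shape_1 = np.sum(rect, axis=1).shape
print(rect_shape_0, rect_shape_1)  # (3,) (2,)
```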

I guess one could argue that the printed representation gives a reasonable intuition for 2D arrays, but this does not generalize straightforwardly to higher dimensions (at least in my head).

But I think the compromise you suggest could be okay. Shall I write up a short info box on this?

shermanlo77 · 2022-12-19T13:55:53Z

That's a good point. The episode goes on to explain the difference between numpy.mean(data, axis=0) and numpy.mean(data, axis=1), so the students should know about dimensions and axes by then.

I think your/our suggestion on using concatenate() could be a good candidate for a pull request

chillenzer · 2022-12-19T14:20:28Z

Great! I will write something up and create a PR. Doesn't have high priority though, so might take a while to arrive. =)

chillenzer · 2022-12-19T14:22:55Z

In fact, np.concatenate could make the whole axis concept even clearer than np.mean, because one can immediately follow the change in shape instead of manually inspecting the data, which at that point is not obvious to compare.
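
As a rough illustration, assuming a small stand-in array named `data` (not the dataset used in the episode):

```python
import numpy as np

# Hypothetical stand-in for the lesson data: 3 rows, 4 columns.
data = np.arange(12).reshape(3, 4)

# np.mean removes the chosen axis, so to check the result one has
# to compare the returned values against the original data.
mean_shape_0 = np.mean(data, axis=0).shape
mean_shape_1 = np.mean(data, axis=1).shape
print(mean_shape_0)  # (4,)
print(mean_shape_1)  # (3,)

# np.concatenate keeps every axis and only lengthens the chosen
# one, so the effect of `axis` is visible in the shape alone.
concat_shape_0 = np.concatenate([data, data], axis=0).shape
concat_shape_1 = np.concatenate([data, data], axis=1).shape
print(concat_shape_0)  # (6, 4)
print(concat_shape_1)  # (3, 8)
```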
