marginal values #8

Open
diegozea opened this issue Jun 23, 2016 · 21 comments

Comments

@diegozea

Would be great to have the ability to show/calculate/store the marginal values of a table, when that is required.

Best,

@nalimilan
Owner

Could you be a bit more specific? Does sum(tab, dims) or mean(tab, dims) do what you need?

@diegozea
Author

Yes, I'm doing that ;)

But I was wondering if freqtable(..., marginal=true) could return a table like this one:

     x1  x2   X
y1    1   2   3
y2    3   2   5
Y     4   4   8

@nalimilan
Owner

In R there's an addmargins function. Given that it's even shorter than adding marginal=true, that could be a better solution. Would you make a PR to add it?

@diegozea
Author

I'm sorry. I don't have time right now to work on that PR :/

@nalimilan
Owner

Then I'll try to have a look later. Shouldn't be hard.

@diegozea
Author

Thanks! There is no hurry.

@bkamins
Collaborator

bkamins commented Dec 19, 2017

I have a similar use case, but rather for an equivalent of prop.table in R.
You can do it now using tab ./ sum(tab, dims), but it is such a common operation that maybe it should be handled by the package. I can imagine two options:

  1. additional wrapper function;
  2. keyword argument for margins.

How do you see it?
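
For reference, the broadcasting idiom mentioned above can be sketched on a plain matrix. This uses only Base Julia; the assumption (not checked here) is that a table returned by freqtable behaves the same way under broadcasting.

```julia
# Plain-matrix stand-in for a frequency table (assumption: a table from
# freqtable broadcasts the same way).
tab = [1 2; 3 4]

overall = tab ./ sum(tab)            # proportions of the grand total
byrow   = tab ./ sum(tab, dims = 2)  # each row sums to 1
bycol   = tab ./ sum(tab, dims = 1)  # each column sums to 1
```

Note that dims = 2 sums across the columns, so dividing by it gives row proportions, which is easy to get backwards.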

@nalimilan
Owner

As I said, I'd rather go with a wrapper function. We could also imagine providing a function, say proptable, which would call freqtable and compute the proportions.

@nico202

nico202 commented Nov 8, 2019

Hi, has this issue been closed without fixing the issue? How to add marginals?

Thanks, Nicolò

@bkamins
Collaborator

bkamins commented Nov 8, 2019

See the referenced PR #19. You can use the prop function.

@nico202

nico202 commented Nov 8, 2019

Thanks, I was using release 0.3.1 where it seems the keyword is not exported.

Btw, maybe I'm doing it wrong.

> table(dat$A, dat$B)

    1 2 3 4
  1 2 2 2 2
  2 2 2 2 2
  3 2 2 2 2

> addmargins(table(dat$A, dat$B))
     
       1  2  3  4 Sum
  1    2  2  2  2   8
  2    2  2  2  2   8
  3    2  2  2  2   8
  Sum  6  6  6  6  24

> addmargins(table(dat$A, dat$B), 1)
     
      1 2 3 4
  1   2 2 2 2
  2   2 2 2 2
  3   2 2 2 2
  Sum 6 6 6 6

julia> freqtable(dat, :A, :B)
3×4 Named Array{Int64,2}
A ╲ B │ 1  2  3  4
──────┼───────────
1     │ 2  2  2  2
2     │ 2  2  2  2
3     │ 2  2  2  2

julia> prop(freqtable(dat, :A, :B), margins = 1)
3×4 Named Array{Float64,2}
A ╲ B │    1     2     3     4
──────┼───────────────────────
1     │ 0.25  0.25  0.25  0.25
2     │ 0.25  0.25  0.25  0.25
3     │ 0.25  0.25  0.25  0.25

julia> prop(freqtable(dat, :A, :B), margins = (1,2))
3×4 Named Array{Float64,2}
A ╲ B │   1    2    3    4
──────┼───────────────────
1     │ 1.0  1.0  1.0  1.0
2     │ 1.0  1.0  1.0  1.0
3     │ 1.0  1.0  1.0  1.0

@bkamins
Collaborator

bkamins commented Nov 8, 2019

Have a look at the help for prop; this is the way to use it:

julia> prop([1 2; 3 4], 1, 2)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0

julia> prop([1 2; 3 4])
2×2 Array{Float64,2}:
 0.1  0.2
 0.3  0.4

julia> prop([1 2; 3 4], 1)
2×2 Array{Float64,2}:
 0.333333  0.666667
 0.428571  0.571429

julia> prop([1 2; 3 4], 2)
2×2 Array{Float64,2}:
 0.25  0.333333
 0.75  0.666667

julia> prop([1 2; 3 4], 1, 2)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0

@nico202

nico202 commented Nov 8, 2019

Thanks, but none of those is similar to what R's addmargins does (which is what's asked here).

@nico202

nico202 commented Nov 8, 2019

I mean, return this:

x = freqtable(dat, :A, :B)

vcat(hcat(x, sum(x, dims = 2)), hcat(sum(x, dims = 1)..., sum(x)))

4×5 Named Array{Int64,2}
A ╲ hcat │  1   2   3   4   5
─────────┼───────────────────
1        │  2   2   2   2   8
2        │  2   2   2   2   8
3        │  2   2   2   2   8
4        │  6   6   6   6  24

preserving names and so on

@nico202

nico202 commented Nov 8, 2019

Ugly, but this:

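using FreqTables, NamedArrays  # setnames! comes from NamedArrays

function addmargins(tab)
    # Row and column names, converted to strings so that "Sum" can be appended
    x, y = names(tab)
    x = string.(x)
    y = string.(y)
    push!(x, "Sum")
    push!(y, "Sum")
    # Append the row totals as a last column, then the column totals
    # and the grand total as a last row
    res = vcat(hcat(tab, sum(tab, dims = 2)), hcat(sum(tab, dims = 1)..., sum(tab)))
    setnames!(res, x, 1)
    setnames!(res, y, 2)
    # Restore the original dimension names (here A and B)
    res.dimnames = tab.dimnames
    res
end

4×5 Named Array{Int64,2}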
A ╲ B    │   1    2    3    4  Sum
─────────┼────────────────────────
1        │   2    2    2    2    8
2        │   2    2    2    2    8
3        │   2    2    2    2    8
Sum      │   6    6    6    6   24

@bkamins
Collaborator

bkamins commented Nov 8, 2019

Ah - understood. I do not think it is supported.

Out of curiosity - in what situation would you need it (apart from the fact that R provides it)?
I am asking because I never needed such functionality (and I use FreqTables.jl on a daily basis), and it is in general unsafe: if you change the contents of such a table, the margins get invalidated, so you lose the consistency of your table.
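
The consistency concern can be illustrated with a small sketch on a plain matrix (Base Julia only; the variable names are made up for the illustration and are not part of FreqTables.jl):

```julia
# Counts with margins stored in the same array: last column holds row totals,
# last row holds column totals, bottom-right cell holds the grand total.
counts = [2 2; 2 2]
with_margins = [counts sum(counts, dims = 2); sum(counts, dims = 1) sum(counts)]

# Changing one cell does not update the stored totals.
with_margins[1, 1] = 10

# The stored row total (4) no longer matches the actual sum of the row (12).
consistent = with_margins[1, end] == sum(with_margins[1, 1:end-1])
```

After the assignment, consistent is false: nothing ties the stored totals to the cells they were computed from.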

@nico202

nico202 commented Nov 8, 2019

In a report or a journal paper it's a nice way to present some data. In this specific case: I have an experiment with outliers. I want to show how many outliers are present for each condition, the sample size, the number of valid/invalid trials... I care about the proportion of valid/invalid trials, but the raw numbers are more important (25% out of 4 or out of 10000 makes a big difference here).

These tables summarize it well:

7-8y old

#+call: outlier-frequency-by-age[:exports results](age="7-8y")

#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false              |       18 |     37 |     38 |         35 | 128 |
| true               |       20 |      1 |      0 |          3 |  24 |
| Sum                |       38 |     38 |     38 |         38 | 152 |

10-11y old

#+call: outlier-frequency-by-age[:exports results](age="10-11y")

#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false              |       33 |     46 |     46 |         46 | 171 |
| true               |       13 |      0 |      0 |          0 |  13 |
| Sum                |       46 |     46 |     46 |         46 | 184 |

adults

#+call: outlier-frequency-by-age[:exports results](age="adults")

#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false              |       15 |     16 |     16 |         16 |  63 |
| true               |        1 |      0 |      0 |          0 |   1 |
| Sum                |       16 |     16 |     16 |         16 |  64 |

(The syntax here is Emacs Org mode; the Julia code that's called is:
addmargins(freqtable(data, :condoutlier, :cond, subset = data.agegroup .== age)))

For the three age groups you see the N of subjects, the n of trials, the n of outliers by condition... Quick and simple (even if R is simpler still, because there you can call the equivalent of freqtable(data, :A, :B, :C) and get many tables at once; in the example above I have to run the function 3 times).

@nico202

nico202 commented Nov 8, 2019

Also, maybe conversion to string can be replaced by something like Union{eltype(x),AbstractString}?

@bkamins
Collaborator

bkamins commented Nov 8, 2019

I agree with this use case, but I would rather create a custom display function for this (one that could, e.g., also use the MIME type to output HTML, LaTeX, etc. automatically) so that the Model stays separate from the View.

@nico202

nico202 commented Nov 8, 2019

It might make sense, but you don't always want to display it with the marginals. So I don't know which is the best way to organize this. Any idea?

@nalimilan
Owner

I agree something like addmargins can be useful. It should also allow specifying the margins to which totals must be added.

Something which is annoying in R is when you want to add margins to a table of proportions: addmargins(prop.table(table(...), 1)) gives correct row sums (equal to 1) but meaningless column sums (equal to sums of row proportions) and grand total (equal to 2). So maybe we should try to find a more convenient API? For example, instead of a function we could add a keyword argument to freqtable and prop. Or maybe introduce addmargins, but also a keyword argument to prop since that's where the problem arises (for raw counts addmargins is OK).
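
The problem with margins on a table of proportions can be reproduced with a small sketch (plain matrix, Base Julia only; the 2×2 counts are an arbitrary example and this is not the R or FreqTables.jl implementation):

```julia
tab = [1 2; 3 4]

# Row proportions, the analogue of prop.table(tab, 1) in R.
p = tab ./ sum(tab, dims = 2)

rowsums = sum(p, dims = 2)  # each entry is 1: the meaningful margin
colsums = sum(p, dims = 1)  # sums of row proportions: not interpretable
total   = sum(p)            # equals the number of rows (2 here), not 1
```

Only the margin along which the proportions were computed is meaningful, which is why an API that knows about that margin could add just the relevant totals.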

@nalimilan reopened this Nov 8, 2019