Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Opta] add formation and lineup changes #333

Merged

Conversation

DriesDeprest
Copy link
Contributor

Added recognition of player position changes in formation changes and introduced more detailed starting positions for players.

However, I'm still struggling on how to properly set the detailed starting positions for players. I want to use the qualifier information of the team setup events (ID 34) present in the F24 / MA3 files, since they contain more detailed information about the starting positions of the players in comparison with the starting positions present in the F7 / MA1 files.

In the draft implementation of my PR, I update the starting positions of the player based on the team setup events, but this results into a dataclasses.FrozenInstanceError: cannot assign to field 'starting_position'.

Should we refactor the code so that the creation of teams (and players) doesn't rely only on F7 / MA1 files, but also on F24 / MA3 files? Thoughts e.g. @probberechts ?

Also, the position labels I've chosen in theformation_mapping are based on my best effort estimations of what the correct description of a position in the zone shown in the StatsPerform documentation appendix. Below, an example of how the positional information is described in the appendix.

image

@DriesDeprest DriesDeprest marked this pull request as draft July 4, 2024 12:14
@DriesDeprest
Copy link
Contributor Author

I've resolved the dataclasses.FrozenInstanceError issue faced, by looking at the Formation_Place in the F7 files to determine granular positions of players.
All should be good now!

@DriesDeprest DriesDeprest marked this pull request as ready for review August 5, 2024 11:46
@DriesDeprest
Copy link
Contributor Author

@koenvo @JanVanHaaren any chance to get a review on this? :-)

Copy link
Collaborator

@JanVanHaaren JanVanHaaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request looks good to me. Thank you!

@koenvo
Copy link
Contributor

koenvo commented Sep 23, 2024

Can you merge current master in?

# Conflicts:
#	kloppy/tests/test_opta.py
#	kloppy/tests/test_statsperform.py
@DriesDeprest
Copy link
Contributor Author

done

@koenvo koenvo added this to the 3.16.0 milestone Oct 22, 2024
@koenvo koenvo merged commit 30c262d into PySport:master Oct 22, 2024
19 checks passed
@UnravelSports
Copy link
Contributor

I've put this in the kloppy development Slack channel too, and even though this has been merged, I think something is broken here. Namely:

dataset = opta.load(
    f24_data='.data/f24-eventdetails.xml',
    f7_data='.data/srml-matchresults.xml'
)
print(len([x for x in dataset.events if x.event_type == EventType.SUBSTITUTION]))
>>> 10
print(len([x for x in dataset.events if x.event_type == EventType.GENERIC and x.event_name == 'player on']))
>>> 10
print(len([x for x in dataset.events if x.event_type == EventType.GENERIC and x.event_name == 'player off']))
>>> 0
  1. I understand the SUBSTITUTION events are there for the purpose of tracking Formation change?
  2. But why are the PLAYER_ON events still in the dataset and why are the PLAYER_OFF events missing? Should they not either all be replaced by the SUB event, or we keep all 3 (PLAYER_ON, PLAYER_OFF and SUB)
  3. If we now do dataset.to_df() we miss the "GENERIC:player off" events. And the "SUBSTITUTION" events are in the dataframe and the player_id column contains the player id of the player coming on, however we don’t have any information for the player coming off inside the dataframe. Should this stuff be put in the receiver_player_id? This feels like we’d be abusing how the dataframe works, however it also seems to make sense to do this over adding a column to the dataframe because that will probably break a lot of stuff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants