Navigation Path Probability Estimation and Pattern Prediction

A library to find the Probability Estimation of Navigation Paths and their Pattern Prediction.

The library helps in identifying the high probability trail path in a data. This navigation probability provides the means to analyze and predict the next link choice of unseen navigation sessions. Currently, the library allows three types of probability estimation from the path data -

State Probability
Transition Probability
Path or Trail Probability

State Probability -

The initial probability of a state is estimated as the proportion of times the corresponding state was requested by the user. This probability is obtained by dividing the number of times a state was browsed by the total number of states browsed.

Transition Probability -

The probability of a transition between two states is estimated by the ratio of the number of times the sequence was visited to the number of total paths where the from page was visited.

Path or Trail Probability -

The probability of a trail is estimated by the product of the initial probability of the first state in the trail and the transition probabilities of the next transitions taken in a path. The chain rule is applied in order to compute all path probabilities.

How to use -

For the probability estimations

import pandas as pd
import path_nav as nv

data = {
	"other_data": [1,4,5],
    "path": [
    	["A", "B", "C", "A", "C"],
    	["B", "D", "B", "A"],
    	["A", "C", "B", "A", "D"]
    ],
    "conversions": [0, 0, 1],
}

df = pd.DataFrame(data)
print(df)

   other_data             path  conversions
0           1  [A, B, C, A, C]            0
1           4     [B, D, B, A]            0
2           5  [A, C, B, A, D]            1

# To find the state probability
state_probability = nv.state_probability(df, 'path')
print(state_probability)

  State  State_probability
0     D           0.142857
1     A           0.357143
2     C           0.214286
3     B           0.285714

# To add the start and conversion values to the path (optional)
df = nv.add_start_end(df,'path','conversions')
print(df)

   other_data                                path  conversions
0           1        [start, A, B, C, A, C, exit]            0
1           4           [start, B, D, B, A, exit]            0
2           5  [start, A, C, B, A, D, conversion]            1

# To find the transition probability
transition_df = nv.transition_probability(df, 'path')
print(transition_df)

from_sitesection to_sitesection  transition_probability
             B           exit                0.000000
             B     conversion                0.000000
             B              D                0.333333
             B              A                0.666667
             B              C                0.333333
             D           exit                0.000000
             D     conversion                0.500000
             D              B                0.500000
             D              A                0.000000
             D              C                0.000000
         start           exit                0.000000
         start     conversion                0.000000
         start              B                0.333333
         start              D                0.000000
         start              A                0.666667
         start              C                0.000000
             A           exit                0.333333
             A     conversion                0.000000
             A              B                0.333333
             A              D                0.333333
             A              C                0.666667
             C           exit                0.500000
             C     conversion                0.000000
             C              B                0.500000
             C              D                0.000000
             C              A                0.500000

# To find the path probability
path_df = nv.path_probability(df, 'path', transition_df)
print(path_df)

   other_data                                path  conversions  path_probability
0           1        [start, A, B, C, A, C, exit]            0          0.012346
1           4           [start, B, D, B, A, exit]            0          0.012346
2           5  [start, A, C, B, A, D, conversion]            1          0.024691

Additional functions -

# To convert the path column to string data type
df = nv.convert_to_str(df, "path")


# To convert the path column to list data type
df = nv.convert_to_list(df, "path")

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
README.md		README.md
path_nav.py		path_nav.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Navigation Path Probability Estimation and Pattern Prediction

State Probability -

Transition Probability -

Path or Trail Probability -

How to use -

Additional functions -

About

Releases

Packages

Languages

visitishan/Navigation-path-prediction-and-probability

Folders and files

Latest commit

History

Repository files navigation

Navigation Path Probability Estimation and Pattern Prediction

State Probability -

Transition Probability -

Path or Trail Probability -

How to use -

Additional functions -

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages