- Client queries the load-balancing server for an inference server address.
- Load balancer returns a serving address chosen by the DRL agent.
- Client sends the actual request to the inference server.
- Artificial network overhead is added upon receiving the inference response.
- Overall result is returned to the evaluator to generate a reward for the DRL agent.
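A minimal sketch of how the artificial-overhead step could work, assuming a hypothetical static latency table keyed by (client region, server region); the region names and values are placeholders, not measurements:

```python
import time

# Hypothetical one-way latency (seconds) between regions; placeholder values.
REGION_LATENCY = {
    ("us-east", "us-east"): 0.005,
    ("us-east", "eu-west"): 0.045,
    ("eu-west", "eu-west"): 0.005,
    ("eu-west", "us-east"): 0.045,
}

def artificial_overhead(client_region: str, server_region: str) -> float:
    """Return the simulated round-trip overhead for one request."""
    one_way = REGION_LATENCY.get((client_region, server_region), 0.050)
    return 2 * one_way  # request + response

def apply_overhead(client_region: str, server_region: str) -> float:
    """Sleep to simulate the overhead; return the delay that was added."""
    delay = artificial_overhead(client_region, server_region)
    time.sleep(delay)
    return delay
```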
- Make test request series and the evaluator.
- Implement a random load balancer (see the baseline sketch after this list).
- Implement a region-aware load balancer.
- Run the DRL load balancer for one hour.
- Compare the results of the three load balancers.
- Automate deployment.
  - Include a Dockerfile for the inference server. Probably also include a bash script to launch multiple Docker containers, each serving a single model.
  - Scripts to install prerequisites.
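For reference, the two baseline policies are simple enough to sketch directly. A minimal illustration, assuming a hypothetical server registry; the addresses and regions are placeholders:

```python
import random

# Hypothetical server registry; addresses and regions are placeholders.
SERVERS = [
    {"address": "10.0.0.1:8000", "region": "us-east"},
    {"address": "10.0.0.2:8000", "region": "eu-west"},
]

def random_balancer(servers=SERVERS) -> str:
    """Pick a serving address uniformly at random."""
    return random.choice(servers)["address"]

def region_aware_balancer(client_region: str, servers=SERVERS) -> str:
    """Prefer a server in the client's region; fall back to random."""
    local = [s for s in servers if s["region"] == client_region]
    return random.choice(local or servers)["address"]
```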
client.py:
- Load ImageNet validation images.
- Query loadbalancer.py for an inference server address.
- Send the request to the inference server.
- Calculate the artificial network overhead.
- Send the result to evaluater.py.
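A rough sketch of the client loop tying these steps together. The endpoints, payload shapes, and the `artificial_overhead` helper (stubbed here; see the overhead sketch above for a table-based version) are all assumptions:

```python
import time
import requests

# Hypothetical endpoints; actual hosts/ports are deployment configuration.
LOADBALANCER_URL = "http://localhost:5000/route"
EVALUATOR_URL = "http://localhost:5001/report"

def artificial_overhead(client_region: str, server_region: str) -> float:
    """Stub; see the table-based overhead sketch above."""
    return 0.0

def send_one_request(image_bytes: bytes, image_id: str, client_region: str) -> None:
    # 1. Ask the load balancer which inference server to use.
    route = requests.get(LOADBALANCER_URL, params={"region": client_region}).json()
    server = route["address"]

    # 2. Send the image to the chosen inference server and time it.
    start = time.time()
    prediction = requests.post(f"http://{server}/predict", data=image_bytes).json()
    latency = time.time() - start

    # 3. Add the artificial network overhead based on the two regions.
    latency += artificial_overhead(client_region, route["region"])

    # 4. Report the outcome so evaluater.py can compute the DRL reward.
    requests.post(EVALUATOR_URL, json={
        "image_id": image_id,
        "prediction": prediction,
        "latency": latency,
        "server": server,
    })
```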
loadbalancer.py:
- Generate an 'observation' for the DRL agent upon a client query for a serving address.
- Return the DRL agent's action (a serving address) to the client.
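One possible shape for loadbalancer.py, sketched with Flask. The server list, the `latest_server_state` feed from servermonitor.py, and the `agent_act` policy interface to drl.py are stubs for illustration, not the actual wiring:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Placeholder server list; the real one comes from deployment config.
SERVERS = ["10.0.0.1:8000", "10.0.0.2:8000"]

def latest_server_state():
    """Stub for the state gathered by servermonitor.py (e.g. CPU load)."""
    return [0.3, 0.7]

def agent_act(observation):
    """Stub for drl.py's policy; returns an index into SERVERS."""
    return 0

@app.route("/route", methods=["GET"])
def route():
    # Build the DRL observation from the client query and the server states.
    observation = {
        "client_region": request.args.get("region", "unknown"),
        "server_state": latest_server_state(),
    }
    action = agent_act(observation)  # the action is a server index
    return jsonify({"address": SERVERS[action]})

if __name__ == "__main__":
    app.run(port=5000)
```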
evaluater.py:
- Load the ImageNet validation labels.
- Listen for client response reports; parse them and feed the resulting reward to drl.py.
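A sketch of how one client report could be turned into a reward, assuming a correctness-minus-latency shape; the weighting is a placeholder, not a chosen hyperparameter:

```python
# Hypothetical reward: +1 for a correct prediction, minus a latency penalty.
# The weight is a placeholder; the real trade-off is a tuning decision.
LATENCY_WEIGHT = 0.1

def compute_reward(report: dict, labels: dict) -> float:
    """Turn one client report into a scalar reward for the DRL agent."""
    correct = report["prediction"] == labels[report["image_id"]]
    return (1.0 if correct else 0.0) - LATENCY_WEIGHT * report["latency"]
```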
servermonitor.py:
- Listen for state reports from reportstate.py.
- Send the gathered state reports to drl.py.
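A minimal Flask sketch of the monitor, assuming reportstate.py POSTs JSON reports keyed by server address; the endpoint path and storage format are assumptions:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Latest report per server address; reportstate.py posts into this.
latest_state = {}

@app.route("/state", methods=["POST"])
def receive_state():
    report = request.get_json()
    latest_state[report["server"]] = report
    return jsonify({"ok": True})

@app.route("/state", methods=["GET"])
def read_state():
    # loadbalancer.py / drl.py can poll this to build observations.
    return jsonify(latest_state)

if __name__ == "__main__":
    app.run(port=5002)
```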
reportstate.py:
- Collect server's state.
- Send it to servermonitor.py.
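A sketch of the reporter using psutil for CPU and memory; the monitor URL, server ID, and reporting interval are placeholders:

```python
import time
import psutil
import requests

# Hypothetical monitor endpoint; host/port are deployment configuration.
MONITOR_URL = "http://localhost:5002/state"
SERVER_ID = "10.0.0.1:8000"

def collect_state() -> dict:
    """Sample this server's resource usage with psutil."""
    return {
        "server": SERVER_ID,
        "cpu_percent": psutil.cpu_percent(interval=None),
        "memory_percent": psutil.virtual_memory().percent,
    }

if __name__ == "__main__":
    while True:
        requests.post(MONITOR_URL, json=collect_state())
        time.sleep(1)  # report once per second; the interval is a placeholder
```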
drl.py:
- Use the request and server_state as the observation (state).
- Use a DQN to approximate the Q-function.
- Return a server/model choice as the action.
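A minimal PyTorch sketch of the Q-network and epsilon-greedy action selection, assuming the observation has already been flattened into a numeric vector; the replay buffer and training loop are omitted:

```python
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP mapping an observation vector to one Q-value per server."""
    def __init__(self, obs_dim: int, n_servers: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_servers),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def select_action(qnet: QNetwork, obs: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy: explore randomly, otherwise pick the best server."""
    if random.random() < epsilon:
        return random.randrange(qnet.net[-1].out_features)
    with torch.no_grad():
        return int(qnet(obs).argmax().item())
```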