Commit d29f302

Add HTTP webserver interface (#6)
1 parent 8e96931 commit d29f302

10 files changed: +382 -53 lines changed

Dockerfile.webserver

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+FROM quay.io/invidious/youtube-trusted-session-generator:latest
+
+COPY docker/scripts/startup-webserver.sh ./

README.md

Lines changed: 35 additions & 14 deletions
@@ -2,7 +2,7 @@
 
 ## Description
 
-This script will output two parameters: po_token and visitor_data. Needed for passing YouTube checks in Invidious.
+This script will output two parameters: po_token and visitor_data. Needed for passing YouTube checks in Invidious or any other program that uses the po_token functionality.
 
 ## What's po_token
 
@@ -15,28 +15,49 @@ These identity tokens (po_token and visitor_data) generated using this tool will
 - You have to run this command on the same public IP address as the one blocked by YouTube. Not necessarily the same machine, just the same public IP address.
 Subsequent usage of this same token will work on the same IP range or even the same ASN. The point is to generate this token on a blocked IP as "unblocked" IP addresses seem to not generate a token valid for passing the checks on a blocked IP.
 
-## Tutorial without Docker
+## Tutorials for "oneshot" command: run the program and get the po_token and visitor_data values
+
+### Tutorial with Docker
+1. Run the script: `docker run quay.io/invidious/youtube-trusted-session-generator`
+2. Copy and paste the values of the two parameters (po_token and visitor_data) into config.yaml
+```
+po_token: XXX
+visitor_data: XXX
+```
+3. Restart Invidious or any other program that uses the po_token functionality.
+
+### Tutorial without Docker
 1. Install Chromium or Google Chrome.
 2. Create a new virtualenv: `virtualenv venv`
 3. Activate the virtualenv: `source venv/bin/activate`
 4. Install the dependencies: `pip install -r requirements.txt`
-5. Run the script: `python index.py`
+5. Run the script: `python potoken-generator.py --oneshot`
 6. Copy and paste the values of the two parameters (po_token and visitor_data) into config.yaml
 ```
 po_token: XXX
 visitor_data: XXX
 ```
-7. Restart Invidious.
+7. Restart Invidious or any other program that uses the po_token functionality.
 
-## Tutorial with Docker
-1. Run the script: `docker run quay.io/invidious/youtube-trusted-session-generator`
-2. Copy paste the values of these the two parameters (po_token and visitor_data) in config.yaml
-```
-po_token: XXX
-visitor_data: XXX
-```
-3. Restart Invidious.
 
-## Why running as root for Docker?
+### Why running as root for Docker?
+
+In "headless: false", Chromium does not support sandboxing when it is not run by the root user.
+
+## Tutorials for "always running" program: Get po_token on demand using HTTP.
+
+### Tutorial with Docker
+Run the program: `docker run -p 8080:8080 quay.io/invidious/youtube-trusted-session-generator:webserver`
+
+### Tutorial without Docker
+1. Install Chromium or Google Chrome.
+2. Create a new virtualenv: `virtualenv venv`
+3. Activate the virtualenv: `source venv/bin/activate`
+4. Install the dependencies: `pip install -r requirements.txt`
+5. Run the program: `python potoken-generator.py`
+
+### Usage of the HTTP API
+
+Send your requests to http://localhost:8080/token in order to obtain your po_token.
 
-In "headless: false", Chromium does not support sanboxing when it is not ran by root user.
+You can also force refresh the po_token in the cache by sending a request to http://localhost:8080/update.
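
The HTTP API described in the README could be consumed with a small client along these lines. This is only a sketch under assumptions: `potoken_generator/server.py` is not shown in this diff, so the use of plain GET requests and the response shape (the JSON form of `TokenInfo`, with `updated`, `potoken` and `visitor_data` keys) are guesses, and the helper names `parse_token_response`, `fetch_token` and `request_refresh` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server listens on the default address given in the README.
BASE_URL = "http://localhost:8080"

# Assumption: /token returns the JSON form of TokenInfo with these keys.
EXPECTED_KEYS = {"updated", "potoken", "visitor_data"}


def parse_token_response(body: str) -> dict:
    """Parse a /token response body and check that the expected keys exist."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("unexpected response: not a JSON object")
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"unexpected response, missing keys: {sorted(missing)}")
    return data


def fetch_token(base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Request <base_url>/token and return the parsed token."""
    with urllib.request.urlopen(f"{base_url}/token", timeout=timeout) as response:
        return parse_token_response(response.read().decode("utf-8"))


def request_refresh(base_url: str = BASE_URL, timeout: float = 10.0) -> str:
    """Request <base_url>/update to ask for a forced refresh; return the raw body."""
    with urllib.request.urlopen(f"{base_url}/update", timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the webserver running, `fetch_token()["potoken"]` would then give the cached token, provided the response really has that shape.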

docker/scripts/startup-webserver.sh

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+#!/bin/sh
+
+echo "[INFO] internally launching GUI (X11 environment)"
+
+XVFB_WHD=${XVFB_WHD:-1280x720x16}
+
+echo "[INFO] starting Xvfb"
+Xvfb :99 -ac -screen 0 $XVFB_WHD -nolisten tcp > /dev/null 2>&1 &
+sleep 2
+
+echo "[INFO] launching chromium instance"
+
+# Run python script on display :99
+DISPLAY=:99 python potoken-generator.py --bind 0.0.0.0

docker/scripts/startup.sh

Lines changed: 2 additions & 2 deletions
@@ -10,5 +10,5 @@ sleep 2
 
 echo "[INFO] launching chromium instance"
 
-# Run python script on display 99
-DISPLAY=:99 python index.py
+# Run python script on display :99
+DISPLAY=:99 python potoken-generator.py --oneshot

index.py

Lines changed: 0 additions & 37 deletions
This file was deleted.

potoken-generator.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+import potoken_generator.main
+
+if __name__ == '__main__':
+    potoken_generator.main.main()

potoken_generator/__init__.py

Whitespace-only changes.

potoken_generator/extractor.py

Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
+import asyncio
+import dataclasses
+import json
+import logging
+import time
+from dataclasses import dataclass
+from pathlib import Path
+from tempfile import mkdtemp
+from typing import Optional
+
+import nodriver
+
+logger = logging.getLogger('extractor')
+
+
+@dataclass
+class TokenInfo:
+    updated: int
+    potoken: str
+    visitor_data: str
+
+    def to_json(self) -> str:
+        as_dict = dataclasses.asdict(self)
+        as_json = json.dumps(as_dict)
+        return as_json
+
+
+class PotokenExtractor:
+
+    def __init__(self, loop: asyncio.AbstractEventLoop,
+                 update_interval: float = 3600,
+                 browser_path: Optional[Path] = None) -> None:
+        self.update_interval: float = update_interval
+        self.browser_path: Optional[Path] = browser_path
+        self.profile_path = mkdtemp()  # cleaned up on exit by nodriver
+        self._loop = loop
+        self._token_info: Optional[TokenInfo] = None
+        self._ongoing_update: asyncio.Lock = asyncio.Lock()
+        self._extraction_done: asyncio.Event = asyncio.Event()
+        self._update_requested: asyncio.Event = asyncio.Event()
+
+    def get(self) -> Optional[TokenInfo]:
+        return self._token_info
+
+    async def run_once(self) -> Optional[TokenInfo]:
+        await self._update()
+        return self.get()
+
+    async def run(self) -> None:
+        await self._update()
+        while True:
+            try:
+                await asyncio.wait_for(self._update_requested.wait(), timeout=self.update_interval)
+                logger.debug('initiating force update')
+            except asyncio.TimeoutError:
+                logger.debug('initiating scheduled update')
+            await self._update()
+            self._update_requested.clear()
+
+    def request_update(self) -> bool:
+        """Request immediate update, return False if update request is already set"""
+        if self._ongoing_update.locked():
+            logger.debug('update process is already running')
+            return False
+        if self._update_requested.is_set():
+            logger.debug('force update has already been requested')
+            return False
+        self._loop.call_soon_threadsafe(self._update_requested.set)
+        logger.debug('force update requested')
+        return True
+
+    @staticmethod
+    def _extract_token(request: nodriver.cdp.network.Request) -> Optional[TokenInfo]:
+        post_data = request.post_data
+        try:
+            post_data_json = json.loads(post_data)
+            visitor_data = post_data_json['context']['client']['visitorData']
+            potoken = post_data_json['serviceIntegrityDimensions']['poToken']
+        except (json.JSONDecodeError, TypeError, KeyError) as e:
+            logger.warning(f'failed to extract token from request: {type(e)}, {e}')
+            return None
+        token_info = TokenInfo(
+            updated=int(time.time()),
+            potoken=potoken,
+            visitor_data=visitor_data
+        )
+        return token_info
+
+    async def _update(self) -> None:
+        try:
+            await asyncio.wait_for(self._perform_update(), timeout=600)
+        except asyncio.TimeoutError:
+            logger.error('update failed: hard limit timeout exceeded. Browser might be failing to start properly')
+
+    async def _perform_update(self) -> None:
+        if self._ongoing_update.locked():
+            logger.debug('update is already in progress')
+            return
+
+        async with self._ongoing_update:
+            logger.info('update started')
+            self._extraction_done.clear()
+            try:
+                browser = await nodriver.start(headless=False,
+                                               browser_executable_path=self.browser_path,
+                                               user_data_dir=self.profile_path)
+            except FileNotFoundError as e:
+                msg = "could not find Chromium. Make sure it's installed or provide direct path to the executable"
+                raise FileNotFoundError(msg) from e
+            tab = browser.main_tab
+            tab.add_handler(nodriver.cdp.network.RequestWillBeSent, self._send_handler)
+            await tab.get('https://www.youtube.com/embed/jNQXAC9IVRw')
+            player_clicked = await self._click_on_player(tab)
+            if player_clicked:
+                await self._wait_for_handler()
+            await tab.close()
+            browser.stop()
+
+    @staticmethod
+    async def _click_on_player(tab: nodriver.Tab) -> bool:
+        try:
+            player = await tab.select('#movie_player', 10)
+        except asyncio.TimeoutError:
+            logger.warning('update failed: unable to locate video player on the page')
+            return False
+        else:
+            await player.click()
+            return True
+
+    async def _wait_for_handler(self) -> bool:
+        try:
+            await asyncio.wait_for(self._extraction_done.wait(), timeout=30)
+        except asyncio.TimeoutError:
+            logger.warning('update failed: timeout waiting for outgoing API request')
+            return False
+        else:
+            logger.info('update was successful')
+            return True
+
+    async def _send_handler(self, event: nodriver.cdp.network.RequestWillBeSent) -> None:
+        if not event.request.method == 'POST':
+            return
+        if '/youtubei/v1/player' not in event.request.url:
+            return
+        token_info = self._extract_token(event.request)
+        if token_info is None:
+            return
+        logger.info(f'new token: {token_info.to_json()}')
+        self._token_info = token_info
+        self._extraction_done.set()
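
The loop in `PotokenExtractor.run` waits on an event with a timeout, so either a forced request or the elapsed interval triggers the next update. The pattern can be tried in isolation with the sketch below. It is a simplified stand-in, not the extractor itself: `update_loop` and `demo` are hypothetical names, the loop runs a fixed number of rounds instead of forever, and the real browser update is replaced by recording what triggered each round.

```python
import asyncio
from typing import List


async def update_loop(update_requested: asyncio.Event, interval: float,
                      rounds: int) -> List[str]:
    """Run `rounds` iterations and record what triggered each one."""
    triggers: List[str] = []
    for _ in range(rounds):
        try:
            # Returns early if the event is set while we are waiting ...
            await asyncio.wait_for(update_requested.wait(), timeout=interval)
            triggers.append("forced")
        except asyncio.TimeoutError:
            # ... otherwise the timeout doubles as the regular schedule.
            triggers.append("scheduled")
        # Reset the flag so the next round waits again.
        update_requested.clear()
    return triggers


async def demo() -> List[str]:
    update_requested = asyncio.Event()
    loop = asyncio.get_running_loop()
    # Simulate a forced update request arriving shortly after start.
    loop.call_later(0.01, update_requested.set)
    return await update_loop(update_requested, interval=0.2, rounds=2)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo the first round is woken by the simulated request and the second one falls through to the timeout. The extractor sets the event through `call_soon_threadsafe` because `request_update` may be called from the web server thread rather than from the event loop.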

potoken_generator/main.py

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+import argparse
+import asyncio
+import logging
+import sys
+from pathlib import Path
+from typing import Optional
+
+import nodriver
+
+from potoken_generator.extractor import PotokenExtractor, TokenInfo
+from potoken_generator.server import PotokenServer
+
+logger = logging.getLogger('potoken')
+
+
+def print_token_and_exit(token_info: Optional[TokenInfo]):
+    if token_info is None:
+        logger.warning('failed to extract token')
+        sys.exit(1)
+    visitor_data = token_info.visitor_data
+    po_token = token_info.potoken
+
+    print('visitor_data: ' + visitor_data)
+    print('po_token: ' + po_token)
+    if len(po_token) < 160:
+        logger.warning("there is a high chance that the potoken generated won't work. Please try again on another internet connection")
+        sys.exit(1)
+    sys.exit(0)
+
+
+async def run(loop: asyncio.AbstractEventLoop, oneshot: bool,
+              update_interval: int, bind_address: str, port: int,
+              browser_path: Optional[Path] = None) -> None:
+    potoken_extractor = PotokenExtractor(loop, update_interval=update_interval, browser_path=browser_path)
+    token = await potoken_extractor.run_once()
+    if oneshot:
+        print_token_and_exit(token)
+
+    extractor_task = loop.create_task(potoken_extractor.run())
+    potoken_server = PotokenServer(potoken_extractor, port=port, bind_address=bind_address)
+    server_task = loop.create_task(asyncio.to_thread(potoken_server.run))
+
+    try:
+        await asyncio.gather(extractor_task, server_task)
+    except Exception:
+        # exceptions raised by the tasks are intentionally propagated
+        # to ensure process exit code is 1 on error
+        raise
+    except (KeyboardInterrupt, asyncio.CancelledError):
+        logger.info('Stopping...')
+    finally:
+        potoken_server.stop()
+
+
+def set_logging(log_level: int = logging.DEBUG) -> None:
+    log_format = '%(asctime)s.%(msecs)03d [%(name)s] [%(levelname)s] %(message)s'
+    datefmt = '%Y/%m/%d %H:%M:%S'
+    logging.basicConfig(level=log_level, format=log_format, datefmt=datefmt)
+    logging.getLogger('asyncio').setLevel(logging.INFO)
+    logging.getLogger('nodriver').setLevel(logging.WARNING)
+    logging.getLogger('uc').setLevel(logging.WARNING)
+    logging.getLogger('websockets').setLevel(logging.WARNING)
+
+
+def args_parse() -> argparse.Namespace:
+    description = '''
+Retrieve potoken using Chromium run by nodriver, serve it on a json endpoint
+
+Token is generated on startup, and then every UPDATE_INTERVAL seconds.
+With web-server running on default port, the token is available on the
+http://127.0.0.1:8080/token endpoint. It is possible to request immediate
+token regeneration by accessing http://127.0.0.1:8080/update
+'''
+    parser = argparse.ArgumentParser(description=description, formatter_class=argparse.RawDescriptionHelpFormatter)
+    parser.add_argument('-o', '--oneshot', action='store_true', default=False,
+                        help='Do not start server. Generate token once, print it and exit')
+    parser.add_argument('--update-interval', '-u', type=int, default=3600,
+                        help='How often new token is generated, in seconds (default: %(default)s)')
+    parser.add_argument('--port', '-p', type=int, default=8080,
+                        help='Port webserver is listening on (default: %(default)s)')
+    parser.add_argument('--bind', '-b', default='127.0.0.1',
+                        help='Address webserver binds to (default: %(default)s)')
+    parser.add_argument('--chrome-path', '-c', type=Path, default=None,
+                        help='Path to the Chromium executable')
+    return parser.parse_args()
+
+
+def main() -> None:
+    args = args_parse()
+    set_logging(logging.WARNING if args.oneshot else logging.INFO)
+    loop = nodriver.loop()
+    main_task = run(loop, oneshot=args.oneshot,
+                    update_interval=args.update_interval,
+                    bind_address=args.bind,
+                    port=args.port,
+                    browser_path=args.chrome_path
+                    )
+    loop.run_until_complete(main_task)
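
Since `print_token_and_exit` prints the two values as `visitor_data: ...` and `po_token: ...` lines and signals failure through exit code 1, a caller could wrap the `--oneshot` mode roughly as follows. This is a sketch, not part of the commit: `parse_oneshot_output` and `run_oneshot` are hypothetical helpers, and the script is assumed to be in the current working directory and to print nothing else that starts with those two keys.

```python
import subprocess
import sys
from typing import Dict

WANTED_KEYS = ("visitor_data", "po_token")


def parse_oneshot_output(stdout: str) -> Dict[str, str]:
    """Extract visitor_data and po_token from the text printed by --oneshot."""
    values: Dict[str, str] = {}
    for line in stdout.splitlines():
        key, sep, value = line.partition(": ")
        # Ignore any line that is not one of the two "key: value" pairs.
        if sep and key in WANTED_KEYS:
            values[key] = value.strip()
    missing = set(WANTED_KEYS) - values.keys()
    if missing:
        raise ValueError(f"missing values in output: {sorted(missing)}")
    return values


def run_oneshot(script: str = "potoken-generator.py") -> Dict[str, str]:
    """Run the generator once and return the parsed values."""
    result = subprocess.run([sys.executable, script, "--oneshot"],
                            capture_output=True, text=True)
    # Exit code 1 covers both "no token extracted" and "token shorter than
    # 160 characters"; this sketch treats either case as a failure.
    if result.returncode != 0:
        raise RuntimeError(f"generator exited with code {result.returncode}")
    return parse_oneshot_output(result.stdout)
```

Log messages go through `logging.basicConfig`, which writes to stderr by default, so they should not mix with the two printed lines on stdout.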
