Developing with vesper » History » Version 5
jun chen, 03/10/2025 01:08 AM
# Developing with vesper

{{toc}}

## Vesper introduction

### 1. What is vesper

Vesper is a Python/C++ based platform that provides a parallel computing service. It includes a distributed file system (DFS), a dynamic job scheduler, and a dynamic resource scheduler.

### 2. When to use vesper

Vesper is recommended when a workload can be split into tasks that run in parallel, or when it involves processing large volumes of data.

### 3. How to use vesper

Any Python script can be run with vesper once the Python site-packages are configured.

General way: `vesper xxx.py`

Interactive mode: `vesper -i xxx.py`

### 4. Scheduler/Worker introduction

Vesper launches one scheduler and several workers.

Scheduler: parses the input script and schedules jobs and workers. Do NOT run heavy tasks on the scheduler!

Worker: executes jobs only. Workers can be launched on the local machine and on remote machines.

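The division of labour described above (the scheduler coordinates, the workers compute) can be illustrated with a minimal sketch that uses only the Python standard library. It is not vesper code: `heavy_task`, `schedule`, and the thread pool are hypothetical stand-ins for a job, the scheduler, and the workers.

```python
# Coordinator/worker split with the standard library. NOT vesper's scheduler:
# heavy_task and the pool are hypothetical stand-ins for a job and the workers.
from concurrent.futures import ThreadPoolExecutor, as_completed


def heavy_task(n):
    """Expensive work that belongs on a worker, not on the scheduler."""
    return sum(i * i for i in range(n))


def schedule(inputs, workers=3):
    """The scheduler side only submits work and collects results."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(heavy_task, n): n for n in inputs}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


results = schedule([10, 100, 1000])
print(results[10])
```

The coordinating function never calls `heavy_task` itself, which is the same discipline the warning about the scheduler asks for.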
### 5. How to launch workers



`wait-for-workers` is not required, but it is recommended if you want all jobs to start at the same time.

### 6. Job introduction

The job system is a DAG, which looks like this:



A job may have no dependencies, or it may depend on other jobs. A job with no dependencies is dispatched to workers immediately; otherwise it is dispatched once all of its dependencies are ready.

A dependency is always a file list. A name node service is used to guarantee that the dependency system works correctly.

A job is a Python function decorated with `as_job`.

Whenever you want to run a task on a worker, you should define it as a job.

Whenever you want to run a task that depends on another job, you should define it as a job.

A job has no return value.



### 7. MapReduce introduction

MapReduce is an easy way to do work in parallel (it is built on the job system). It is recommended for simple work such as numerical computation.

MapReduce always works on a collection object such as a list, dict, or tuple.






### 8. UserView introduction

UserView is a dict-like object (keys must be strings) that can be shared by different workers. It works like this:

On the scheduler:

```
uv = UserView("tmp")
uv["key1"] = 999
```

On worker1:

```
uv["key2"] = 9999
```

On worker2, read the values:

```
key1 = uv["key1"]
key2 = uv["key2"]
```
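
To make the "dict-like object with string keys" idea concrete, here is a single-process sketch built on `collections.abc.MutableMapping`. `ToyUserView` is a hypothetical stand-in, not vesper's `UserView`. In particular, it assumes that opening a view with the same name yields the same data, which this page does not state explicitly.

```python
# Dict-like shared view with string-only keys. NOT vesper's UserView:
# ToyUserView is a hypothetical, single-process stand-in.
from collections.abc import MutableMapping


class ToyUserView(MutableMapping):
    _stores = {}  # view name -> backing dict shared by all handles

    def __init__(self, name):
        self._data = ToyUserView._stores.setdefault(name, {})

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("UserView keys must be strings")
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


scheduler_view = ToyUserView("tmp")
scheduler_view["key1"] = 999

worker1_view = ToyUserView("tmp")   # same name, same backing store
worker1_view["key2"] = 9999

worker2_view = ToyUserView("tmp")
print(worker2_view["key1"], worker2_view["key2"])
```

A real shared view has to move data between processes and machines; the sketch only mirrors the access pattern shown in the snippets above.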

## High level usage

### Python level developing

At the high level, parallel computation is written using only Python code.

Using MapReduce:



Using Job:



## Low level usage