Developing with vesper » History » Version 7

jun chen, 03/10/2025 01:09 AM

# Developing with vesper

{{toc}}

## Vesper introduction

### 1. What is vesper

Vesper is a Python/C++ based platform that provides parallel computing services. It includes a distributed file system (DFS), a dynamic job scheduler, and a dynamic resource scheduler.

### 2. When to use vesper

Vesper is recommended when a workload can be split into tasks that run in parallel, or when it involves computation over large volumes of data.

### 3. How to use vesper

Any Python script can be run with vesper once the Python site-packages have been configured.

General way: `vesper xxx.py`

Interactive mode: `vesper -i xxx.py`

### 4. Scheduler/Worker Introduction

Vesper launches one scheduler and several workers.

Scheduler: parses the input script and schedules jobs and workers, so do NOT run heavy tasks on the scheduler!

Worker: only executes jobs. Workers can be launched on the local machine and on remote machines.

### 5. How to launch workers

![](clipboard-202503100058-muhgk.png)

`wait-for-workers` is not required, but it is recommended if you want all jobs to start at the same time.

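
The effect of waiting for workers can be pictured with a small standard-library sketch. It assumes the option holds work back until every expected worker is up. It does not use vesper itself: `EXPECTED_WORKERS`, `worker`, and the `events` log are illustrative names, and threads stand in for worker processes.

```python
import threading

EXPECTED_WORKERS = 3  # illustrative value, not a vesper setting
all_workers_ready = threading.Barrier(EXPECTED_WORKERS)
events = []
events_lock = threading.Lock()


def worker(worker_id):
    # Record that this worker is up.
    with events_lock:
        events.append(("ready", worker_id))
    # Block until every expected worker has reached this point.
    all_workers_ready.wait()
    # Only now does any worker begin its job.
    with events_lock:
        events.append(("start", worker_id))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(EXPECTED_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

kinds = [kind for kind, _ in events]
print(kinds)  # every "ready" entry comes before the first "start" entry
```

Without the barrier, a worker that comes up early could begin its job while the others are still launching.
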
### 6. Job introduction

The job system is a DAG (directed acyclic graph), which looks like this:

![](clipboard-202503100058-ggac0.png)

A job may have no dependencies, or it may depend on other jobs. A job with no dependencies is dispatched to a worker immediately; otherwise it is dispatched once all of its dependencies are ready.

A dependency is always a list of files. A name node service guarantees that the dependency system works correctly.

A job is a Python function decorated with `as_job`.

Whenever you want to run a task on a worker, define it as a job.

Whenever you want to run a task that depends on another job, define it as a job.

A job has no return value.

![](clipboard-202503100058-g5jyn.png)

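
The dispatch rule described above can be illustrated with a small, self-contained sketch. This is not vesper's implementation. The `as_job` decorator below is a hypothetical stand-in that only records each function's input and output files, `run_all` is an invented helper, and plain file existence stands in for the name node service. The real signature of `as_job` may differ.

```python
import os
import tempfile

# Hypothetical registry; vesper's real as_job decorator may work differently.
_JOBS = []


def as_job(inputs=(), outputs=()):
    """Register a function as a job with file dependencies (illustrative only)."""
    def decorator(func):
        _JOBS.append({"func": func, "inputs": list(inputs), "outputs": list(outputs)})
        return func
    return decorator


def run_all(jobs):
    """Run each job once all of its input files exist; return the run order."""
    pending = list(jobs)
    order = []
    while pending:
        ready = [j for j in pending if all(os.path.exists(p) for p in j["inputs"])]
        if not ready:
            raise RuntimeError("unsatisfiable dependency: an input file is never produced")
        for job in ready:
            job["func"]()  # a job returns nothing; its results are files
            order.append(job["func"].__name__)
            pending.remove(job)
    return order


workdir = tempfile.mkdtemp()
raw = os.path.join(workdir, "raw.txt")
total = os.path.join(workdir, "total.txt")


# Declared first on purpose: it still runs second because it needs raw.txt.
@as_job(inputs=[raw], outputs=[total])
def summarize():
    with open(raw) as src, open(total, "w") as dst:
        dst.write(str(sum(int(line) for line in src)))


@as_job(outputs=[raw])
def produce():
    with open(raw, "w") as dst:
        dst.write("1\n2\n3\n")


order = run_all(_JOBS)
print(order)  # ['produce', 'summarize']
```

Because jobs communicate only through files, the order of declaration does not matter; only the file dependencies decide when a job can run.
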
### 7. MapReduce Introduction

MapReduce is an easy way to do work in parallel (it is built on the job system). It is recommended for simple work such as numerical computation.

MapReduce always works on a collection object such as a list, dict, or tuple.

![](clipboard-202503100058-idmq4.png)

![](clipboard-202503100058-qpcss.png)

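
For readers unfamiliar with the pattern, here is a minimal sketch of map-reduce over a list using only the Python standard library. It does not reproduce vesper's MapReduce API. `map_reduce` and `square` are illustrative names, and a thread pool stands in for vesper workers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def square(x):
    """Map step: applied independently to each element."""
    return x * x


def map_reduce(mapper, reducer, collection, initial, max_workers=4):
    """Apply mapper to every item concurrently, then fold the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        mapped = list(pool.map(mapper, collection))  # input order is preserved
    return reduce(reducer, mapped, initial)


numbers = [1, 2, 3, 4, 5]
total = map_reduce(square, operator.add, numbers, 0)
print(total)  # 55
```

The map step has no shared state, which is what makes it safe to spread across workers. Only the reduce step combines results.
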
### 8. UserView introduction

UserView is a dict-like object (keys must be strings) that can be shared between different workers. For example:

On the scheduler:

```
uv = UserView("tmp")
uv["key1"] = 999
```

On worker1:

```
uv["key2"] = 9999
```

On worker2, read both values:

```
key1 = uv["key1"]
key2 = uv["key2"]
```
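
The sharing behaviour can be mimicked locally with a short sketch. It is only an analogy: `SharedView` is a hypothetical stand-in for UserView, threads stand in for the scheduler and workers, and vesper's real object presumably shares data across processes and machines rather than within one process.

```python
import threading


class SharedView:
    """Hypothetical stand-in for UserView: a lock-protected, string-keyed mapping."""

    def __init__(self, name):
        self.name = name
        self._data = {}
        self._lock = threading.Lock()

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        with self._lock:
            self._data[key] = value

    def __getitem__(self, key):
        with self._lock:
            return self._data[key]


uv = SharedView("tmp")
uv["key1"] = 999  # written by the "scheduler"


def worker1():
    uv["key2"] = 9999


seen = {}


def worker2():
    seen["key1"] = uv["key1"]
    seen["key2"] = uv["key2"]


t1 = threading.Thread(target=worker1)
t1.start()
t1.join()  # make sure key2 exists before worker2 reads it
t2 = threading.Thread(target=worker2)
t2.start()
t2.join()
print(seen)  # {'key1': 999, 'key2': 9999}
```
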
----------------------------------------

## High level usage

### Python level developing

At the high level, parallel computation is written entirely in Python.

Using MapReduce:

![](clipboard-202503100107-v8oow.png)

Using Job:

![](clipboard-202503100108-9ovps.png)

--------------------------------------

## Low level usage

### Mixed Python/C++ Developing

1. Code Structure

![](clipboard-202503100109-hlymj.png)

2. Compile System

![](clipboard-202503100109-zn5xz.png)