Developing with vesper » History » Version 3

jun chen, 03/10/2025 01:05 AM

# Developing with vesper

## Vesper introduction

### 1. What is vesper

Vesper is a Python/C++ based platform that provides parallel computing services. It includes a distributed file system (DFS), a dynamic job scheduler, and a dynamic resource scheduler.

### 2. When to use vesper

Vesper is recommended when a workload can be split into tasks that run in parallel, or when it involves processing large volumes of data.

### 3. How to use vesper

Any Python script can be run with vesper once the Python site-packages have been configured.

General way: `vesper xxx.py`

Interactive mode: `vesper -i xxx.py`

### 4. Scheduler/Worker Introduction

Vesper launches one scheduler and several workers.

Scheduler: parses the input script and schedules jobs and workers. Do NOT run heavy tasks on the scheduler!

Worker: only executes jobs. Workers can be launched on the local machine and on remote machines.

### 5. How to launch workers
![](clipboard-202503100058-muhgk.png)

wait-for-workers is not required, but it is recommended if you want all jobs to start at the same time.

### 6. Job introduction

The job system is organized as a DAG (directed acyclic graph), for example:

![](clipboard-202503100058-ggac0.png)

A job may have no dependencies, or it may depend on other jobs. A job without dependencies is dispatched to a worker immediately; otherwise it is dispatched once all of its dependencies are ready.

A dependency is always a list of files. A name node service guarantees that the dependency system works correctly.

A job is a Python function decorated with `as_job`.

Whenever you want to run a task on a worker, define it as a job.

Whenever you want to run a task that depends on another job, define it as a job.

A job has no return value.

![](clipboard-202503100058-g5jyn.png)
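
To make the dispatch rule concrete, here is a minimal stand-alone sketch of file-based dependency ordering. It does not use vesper's API: the function `dispatch_order`, the `(input_files, output_files)` tuple layout, and the file names are all hypothetical and exist only for illustration. It assumes a job becomes runnable once every file it depends on has been produced.

```python
def dispatch_order(jobs, available=()):
    """Return job names in an order that respects file dependencies.

    `jobs` maps a job name to a pair (input_files, output_files).
    This is an illustration only, not how vesper is implemented.
    """
    ready_files = set(available)   # files that already exist
    pending = dict(jobs)           # jobs not dispatched yet
    order = []
    while pending:
        # A job is runnable when all of its input files are ready.
        runnable = [name for name, (deps, _) in pending.items()
                    if set(deps) <= ready_files]
        if not runnable:
            raise ValueError("unsatisfiable or cyclic dependencies: %s"
                             % sorted(pending))
        for name in runnable:
            _, outputs = pending.pop(name)
            order.append(name)
            ready_files.update(outputs)  # its outputs unblock later jobs
    return order


jobs = {
    "merge": (["a.out", "b.out"], ["result.txt"]),  # waits for both parts
    "part_a": ([], ["a.out"]),                      # no dependency
    "part_b": ([], ["b.out"]),                      # no dependency
}
order = dispatch_order(jobs)
print(order)  # ['part_a', 'part_b', 'merge']
```

The two jobs without dependencies are dispatched first, and `merge` is held back until both of its input files are ready.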
### 7. MapReduce Introduction

MapReduce is an easy way to do work in parallel (it is built on the job system). It is recommended for simple work such as numerical computation.

MapReduce always works on a collection object such as a list, dict, or tuple.

![](clipboard-202503100058-idmq4.png)

![](clipboard-202503100058-qpcss.png)
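
For readers unfamiliar with the pattern, the sketch below shows the general map/reduce idea using only the Python standard library. It is not vesper's MapReduce API: the helper `map_reduce` and the `square` function are hypothetical, and a local thread pool stands in for vesper workers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def square(x):
    """Map step: applied independently to every element."""
    return x * x


def map_reduce(mapper, reducer, items, max_workers=4):
    """Apply `mapper` to each item in parallel, then fold with `reducer`.

    `items` must be non-empty, because reduce() is called without
    an initial value.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        mapped = list(pool.map(mapper, items))  # results keep input order
    return reduce(reducer, mapped)


total = map_reduce(square, operator.add, [1, 2, 3, 4])
print(total)  # 30
```

The map step has no dependencies between elements, which is what makes it easy to spread across workers; only the reduce step needs all mapped results.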
### 8. UserView introduction

UserView is a dict-like object (keys must be strings) that can be shared between different workers. For example:

On the scheduler:

```
uv = UserView("tmp")
uv["key1"] = 999
```

On worker1:

```
uv["key2"] = 9999
```

On worker2, read the values:

```
key1 = uv["key1"]
key2 = uv["key2"]
```
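
The behaviour described in this section can be mimicked in a single process with the sketch below. It is not vesper's implementation: the class `LocalView` is hypothetical, and it assumes that two views opened with the same name see the same data, which the snippets above suggest but do not state.

```python
from collections.abc import MutableMapping


class LocalView(MutableMapping):
    """Single-process stand-in for a named, dict-like view with string keys."""

    _stores = {}  # view name -> backing dict shared by all views of that name

    def __init__(self, name):
        self._data = self._stores.setdefault(name, {})

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be strings")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


scheduler_view = LocalView("tmp")
scheduler_view["key1"] = 999      # written on the "scheduler"

worker_view = LocalView("tmp")    # same name, so same backing data
worker_view["key2"] = 9999        # written on a "worker"

print(worker_view["key1"], scheduler_view["key2"])  # 999 9999
```

A real shared view has to move data between processes and machines; this sketch only illustrates the dict-like interface and the string-key restriction.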
## High level usage 
## Low level usage