Tomato Time Estimation, Time Plan

2016-08-22. Category & Tags: Tomato, Time Management, Write, Read

PAPERS & WRITING #

Action	Time	Description & Notice
self-archive/publish 1 paper full-text online	2+ tomatoes	1h+, w/o code; GitHub + RG + Google Scholar etc.
converting 1 paper (A4, 2 columns, 11 pages) from MS Word to LaTeX	13 tomatoes	1 ~ 1.5 days
reading 1 paper (qiqqa)	1 tomato	average time for NEW papers, regardless of scanning or deep reading.
scanning 3 papers (qiqqa)	1 tomato	judge whether each paper is relevant; note keywords, key procedure, key conclusion, and citation points.
deep reading 1 paper (qiqqa)	3 tomatoes	at least 3.
self revision of draft > quick mark & correction (real pen)	2 tomatoes	19 pages (w/o bibliography).
self revision of draft > quick apply correction (LaTeX)	4 tomatoes	1 more is needed if also correcting newly found small issues (e.g. typos) during quick apply. 19 pages (w/o bibliography).
detailed self revision of draft > mark & correction both structure & language (real pen)	11 tomatoes	26 pages (w/o bibliography). got bored after 5 tomatoes. +7 tomatoes if the content is also revised.
update of GPU draft result section due to exp update > mark & correct & compare with old results	x tomatoes	boring because there was no big difference, yet it took a lot of time.
update of GPU draft analysis section due to exp results update > mark & correct	10 tomatoes	2 full-text pages. structure & language & compare new results with literature. (kind of rewrite).
SARIMA: math, code, debug and parallel	32 tomatoes	4 full days; including 5 tomatoes for debugging.
SARIMA: parallel: try & debug	5 tomatoes	first time implementing a parallel program in R. Tried 3 different ways (doParallel + foreach; parallel, e.g. parApply; Techila).
Techila* platform: usage basics, try examples, implement my solution & debug	19 tomatoes	1 tomato to install; 3 tomatoes to try different official examples and decide on a solution: foreach(); 5 tomatoes to learn & implement my solution using foreach(); 9 tomatoes to debug Techila’s own problems (e.g. how to use libraries, how to upload data in a tricky way); have not tried Techila’s own way to upload data.

*:
[Techila] Good documentation: the manual/tutorial makes it very easy to start the official examples on Google Cloud Platform, but getting my own solution to run takes some time.

EXPERIMENTS, VISUALIZATION & CODING #

Action	Time	Description & Notice
learning Django MVC from a good video tutorial	18 tomatoes (est.)	2.5-hour video (see PS) => toggl 11 hours. only learned the basics, without knowing why they work; notes in blog; (background: already knew Zend MVC, did not know Python.)
learning Django rest_framework from two videos	18 tomatoes (est.)	1 hour, 2 videos => toggl 10 hours; with unclear explanations + only partial code, it took longer to follow and understand; started to get a feel for Django and its REST API; notes in blog.
learning Model Form (inc. create/update/delete)	8 hours toggl	inc. 1 hour to find the right tutorial videos.
learning, trying & comparing different ways of uploading file(s)	4 hours toggl	django: function-based & class-based views.
change project to new IDE	6 tomatoes	install Jupyter & R dependencies, set up mandatory options, try to improve other options; OK to use, know basic shortcuts, but not so familiar with the new Jupyter environment.

HOW SPARK (BASIC) TIME PLAN FAILED #

In mid-January, I planned to learn Spark basics and deploy it in standalone mode & on Mesos within 4 weeks.

before #

Before this time plan, I had already spent 72 hours on hardware & system installation.

Summary: 27 hours to learn & set up automatic installation (a better automatic method, PXE, could be learned from a teacher).
17 hours to learn X & enable remote X.
10 hours to organize hardware, such as organizing pc cases, tables & cables.

Big Data and Cloud Computing	72 hours
Requirement Meeting	00:30:25
Spark > Meeting	01:34:00
Spark > Enable Router & Remote	10:56:44
Spark > Hardware	09:28:53
Spark > Integrate Matlab	01:40:29
Spark > Mess	01:31:20
Spark > Plan	02:03:40
Spark > System	11:21:47
Spark > System > Auto Install	12:52:49
Spark > System > Debug	01:07:23
Spark > System > LVM Partitions	00:28:25
Spark > System > Server Terminal	03:00:00
Spark > System > X	07:13:22
Spark > Vmware	04:28:53
Spark > Vmware > Matlab Headless	01:15:03
Spark > Vmware > SSH key problem	02:45:09

original plan #

Week 3: install base systems (Ubuntu).
Week 4: hello world (word count) in virtual machines. (+ YARN; I thought YARN was mandatory)
Week 5: Mesos + BIND (self-hosted DNS).
Week 6: deploy on real machines.

results #

Week 3: 40 hours on Spark.
11 hours for hardware (dirty cables).
23 (+5?) hours for setting up virtual environment inc. file sharing.
6.5 hours for surfing info.
Efficiency was sometimes lacking (e.g. 6.5 hours of surfing).

Big Data and Cloud Computing	40 hours
Spark > DataBricks; Github Info and MOOC etc.	06:32:00
Spark > Dirty Cables	09:04:16
Spark > Hardware > Cables	02:05:02
Spark > New VirtualMachine > Remote Work @ Home	05:17:00
Spark > New VirtualMachine > VirtualBox	01:03:52
Spark > New VirtualMachine > Vmware Host-Guest Share Files	02:00:00
Spark > Puppet/Ansible	01:57:17
Spark > Share File	05:32:21
Spark > Windows for VirtualMachine	04:29:35

Week 4: nothing on Spark.
Week 5: 25 hours on Spark [50 hours in total].
7 hours on hardware;
15 hours to learn Scale Py coding (& book).

Big Data and Cloud Computing	25 hours
Spark > Scale Py > Book Only (Accumulated Time)	03:00:00
Spark > Scale Py > Ch1	03:28:29
Spark > Scale Py > Ch2	02:01:14
Spark > Scale Py > Plan (Accumulated Time)	02:00:00
Spark > Hardware	07:04:18
Spark > Scale Py > Remote Jupyter	03:22:25
W > T’s Friend Temp Computer > Reset	03:48:52

Week 6: nothing serious on Spark. Spent 1 hour trying to activate Windows and failed.
Week 7: 10 hours to get the environment kind of ready.

Big Data and Cloud Computing	10 hours
Spark > Scale Py > DL	01:00:00
Spark > Hardware > Dirty Cables	01:33:34
Spark > Software and Hardware > Network	03:37:00
Spark > Software > Network	01:48:10
Spark > Software > Puppet	01:39:05
Spark > Software > System	30:00 min

Week 8: 7.5 hours to learn Scale Py coding & book.

Big Data and Cloud Computing	7.5 hours
Spark > Scale Py > Ch2	06:30:14
Spark > Scale Py > Ch2 > Book Only	20:18 min
Spark > Scale Py > Ch9 > Book Only	22:18 min

Week 9: 85 hours, and got Spark standalone running.
24 hours to learn Scale Py coding & book.
17 hours to set up Spark standalone.
10 hours to disable X to free up more resources.

Big Data and Cloud Computing	85 hours
DS Webinar	01:15:00
Spark > Scale Py > Ch8	04:58:00
Spark > Scale Py > Ch8 > Book Only	01:47:00
Spark > 1st Performance Test > Prepare System	07:28:27
Spark > DNS	02:03:17
Spark > Hardware and Software > Network	01:13:56
Spark > Scale Py > Ch8	07:41:27
Spark > Scale Py > Ch9	06:03:05
Spark > Scale Py > Ch9 > Read Book and Debug of: convert features_header to list	02:14:07
Spark > Scale Py > Ch9 > Understanding	02:00:00
Spark > Self Evaluation, Review, Summary	59:43 min
Spark > Standalone in VirtualBox	07:51:24
Spark > System > Disable X	02:26:12
Spark > System > Disable X & Auto Install Spark	06:19:00
Spark > System > Disable X > Review Basic Linux	54:09 min
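
The weekly totals above can be recomputed from the toggl entries. The following is only a minimal sketch: the function name `parse_duration` is hypothetical, and it assumes that entries written like "20:18 min" mean minutes:seconds while the others are HH:MM:SS. The example values are the three Week 8 entries.

```python
def parse_duration(text: str) -> int:
    """Parse 'HH:MM:SS' or 'MM:SS min' into seconds.

    Assumption: a trailing 'min' marks a minutes:seconds entry.
    """
    text = text.strip()
    if text.endswith("min"):
        minutes, seconds = text[:-3].strip().split(":")
        return int(minutes) * 60 + int(seconds)
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


# Week 8 entries, copied from the log above.
week8 = ["06:30:14", "20:18 min", "22:18 min"]

total_seconds = sum(parse_duration(entry) for entry in week8)
total_hours = total_seconds / 3600
print(f"Week 8: {total_hours:.2f} h")
```

The computed sum can then be compared with the rounded weekly figure in the heading to spot untracked or mis-tracked time.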

other time was used for #

Week 3: 20 hours to write paper.
Week 4: 30 hours to revise paper. [40 hours in total]. (Friday is Spring Festival).
Week 5: 10 hours to write paper, 10 hours TA. [50 hours in total]. finally found the right book/material.
Week 6: 25 hours to write paper, 20 hours TA. [45 hours in total]. 1 hour to activate Windows for Spark (failed).
Week 7: 45 hours to write paper, 20 hours TA, 7 hours to welcome professor. [45 hours in total].
Week 8: 10 hours for ISP, 5 hours for department meeting, 17 hours to welcome professor, 22 hours for QDA. [60 hours in total].
Week 9: 5 hours to welcome professor, 5 hours to another city campus, 4 hours for QDA. [73 hours in total].

analysis of problems (reasons for the delay) #

In total, 168 hours were spent before Spark standalone was running.
0. The time was not estimated in the right way. For the key tasks, only 8 hours are needed to set up standalone (virtual & real PCs), plus 48 hours of reading & coding with the Scale Py book. Learn: 48 hours + final setup: 8 hours.

The 12.5 hours on dirty cables were unnecessary (this is a lower bound: the 7 hours logged as “hardware” are unclear and not included).
The 30 hours on automatic installation could be reduced (by at least half? = 15 hours). Thus, 168 hours becomes about 140 hours (about 16% of the time saved).
In addition, if I had known that Matlab can run without a screen, everything related to X could have been discarded (27 hours).
Therefore, 168 hours can be reduced to ≈120, saving roughly 1/3 of the time in total.
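
The reductions above are simple subtractions, so they can be checked step by step. This is a minimal sketch using only the hours named in the analysis; the names `savings` and `remaining_after` are hypothetical. The exact differences come out slightly different from the rounded figures in the text.

```python
TOTAL_HOURS = 168.0  # total until Spark standalone was running

# Avoidable hours named in the analysis above.
savings = [
    ("dirty cables", 12.5),
    ("half of automatic installation", 15.0),
    ("everything related to X", 27.0),
]


def remaining_after(total, items):
    """Return (label, remaining hours, cumulative fraction saved) per step."""
    steps = []
    remaining = total
    for label, hours in items:
        remaining -= hours
        steps.append((label, remaining, (total - remaining) / total))
    return steps


steps = remaining_after(TOTAL_HOURS, savings)
for label, remaining, fraction in steps:
    print(f"without {label}: {remaining:.1f} h left, {fraction:.0%} saved")
```

Keeping the items in a list makes it easy to add or drop a saving (for example the unclear 7 hours of "hardware") and see how the total changes.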

solutions #

Make a more detailed plan (in hours): (learn theory & coding, following a practical book) + final setup; the same amount of time again for surrounding work (app running environment & system); add 1/3 extra for unexpected waste. YouTube video watching is not included.
Ask for help and delegate. Do not assume that you do everything best yourself; only teamwork makes 1+1 ≫ 2 and allows time to be allocated sensibly.
Ask experts & teachers for help. It is not mandatory to learn everything; even if I want to learn the details, experts usually provide better methods.
It is almost certain that Matlab can run without a screen. I should confirm this first and develop the system with some limitations (no screen) to cut unnecessary time. If the stakeholders do want a screen later, I can spend extra time to develop this feature.
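
The planning rule in the first solution can be written as a small formula. This is only a sketch under assumptions: the function name `planned_hours` is hypothetical, the 1/3 buffer is assumed to apply to the whole subtotal (core + surrounding work), and the inputs 48 and 8 are the learning and setup hours from the analysis section.

```python
def planned_hours(learn, setup, surrounding_ratio=1.0, buffer_ratio=1 / 3):
    """Estimate total hours for a learning-and-deployment task.

    core        = learning (theory & coding) + final setup
    surrounding = the same time again for environment & system work
    buffer      = 1/3 on top for unexpected waste (assumed on the subtotal)
    """
    core = learn + setup
    surrounding = core * surrounding_ratio
    subtotal = core + surrounding
    return subtotal * (1 + buffer_ratio)


estimate = planned_hours(learn=48, setup=8)
print(f"planned: {estimate:.0f} h")
```

With 48 hours of learning and 8 hours of setup this gives roughly 149 hours, which is far closer to the 168 hours actually spent than a 4-week plan made without such a rule.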

PS #

The paper reading time is for qiqqa reading & noting.
Sentdex’s video is 3 hours, but 2.5 hours without the server & SSL parts.
est.: estimated, used when no tomato log is available.
The tomatoes in the tables are standard tomatoes (1 tomato = 25min + 5min); if the actually recorded time is not 25min, it is translated to standard values.
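
The translation to standard tomatoes can be sketched as follows. This is an assumption about the conversion, not a record of how the tables were actually computed: it assumes that only focused minutes count, so 25 focused minutes make 1 tomato, and that each tomato occupies 30 minutes of wall-clock time. The function names are hypothetical.

```python
WORK_MIN = 25   # focused minutes in one standard tomato
BREAK_MIN = 5   # break minutes attached to each tomato


def to_standard_tomatoes(focused_minutes: float) -> float:
    """Convert recorded focused minutes into standard tomatoes.

    Assumption: only focused time counts, so 25 focused minutes = 1 tomato.
    """
    return focused_minutes / WORK_MIN


def wall_clock_hours(tomatoes: float) -> float:
    """Wall-clock hours for a number of standard tomatoes, breaks included."""
    return tomatoes * (WORK_MIN + BREAK_MIN) / 60


if __name__ == "__main__":
    print(to_standard_tomatoes(40))  # one recorded 40-minute session
    print(wall_clock_hours(32))      # the 32-tomato SARIMA task
```

Under this convention, the 32 tomatoes of the SARIMA task correspond to 16 wall-clock hours.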