Tomato Time Estimation, Time Plan

2016-08-22. Category & Tags: Tomato, Time Management, Write, Read

PAPERS & WRITING #

Action | Time | Description & Notice
self-archive/publish 1 paper full-text online | 2+ tomatoes | 1h+, w/o code; GitHub + RG + Google Scholar etc.
converting 1 paper (A4, 2 columns, 11 pages) from MS Word to LaTeX | 13 tomatoes | 1 ~ 1.5 days
reading 1 paper (qiqqa) | 1 tomato | average time for NEW papers, regardless of scanning or deep reading.
scanning 3 papers (qiqqa) | 1 tomato | judge whether the paper is relevant or not; keywords, key procedure, key conclusion, citation points.
deep reading 1 paper (qiqqa) | 3 tomatoes | at least 3.
self revision of draft > quick mark & correction (real pen) | 2 tomatoes | 19 pages (w/o bibliography).
self revision of draft > quick apply correction (LaTeX) | 4 tomatoes | 1 more is needed if also correcting newly found small issues (e.g. typos) during quick apply. 19 pages (w/o bibliography).
detailed self revision of draft > mark & correction of both structure & language (real pen) | 11 tomatoes | 26 pages (w/o bibliography). Got bored after 5 tomatoes. +7 tomatoes if also revising content.
update of GPU draft result section due to exp update > mark & correct & compare with old results | x tomatoes | boring because there was no big difference; used a lot of time.
update of GPU draft analysis section due to exp results update > mark & correct | 10 tomatoes | 2 full-text pages. Structure & language & comparing new results with literature (kind of a rewrite).
SARIMA: math, code, debug and parallel | 32 tomatoes | 4 full days; including 5 tomatoes for debugging.
SARIMA: parallel: try & debug | 5 tomatoes | first time implementing a parallel program in R. Did it in 3 different ways (doParallel + foreach; parallel+, e.g. parApply; Techila).
Techila* platform: usage basics, try examples, implement my solution & debug | 19 tomatoes | 1 tomato to install; 3 tomatoes to try different official examples and to decide on the solution: foreach(); 5 tomatoes to learn & implement my solution using foreach(); 9 tomatoes to debug Techila’s own problems (e.g. how to use libraries, how to upload data in a tricky way); have not tried Techila’s own way of uploading data.

*:
[Techila] Good documentation; it is very easy to follow the manual/tutorial to start the official examples on Google Cloud Platform, but it takes some time to get your own solution running.

EXPERIMENTS, VISUALIZATION & CODING #

Action | Time | Description & Notice
learning Django MVC from a good video tutorial | 18 tomatoes (est.) | 2.5 hours video (ps) => toggl 11 hours. Only learned the basics; don’t know why; notes in blog. (Background: already knew Zend MVC, did not know Python.)
learning Django rest_framework from two videos | 18 tomatoes (est.) | 1 hour, 2 videos => toggl 10 hours; with unclear explanations and only partial code, it took longer to follow and understand; started to get a feel for Django and its REST API; notes in blog.
learning Model Form (inc. create/update/delete) | 8 hours toggl | inc. 1 hour to find the right tutorial videos.
learning, trying & comparing different ways of uploading file(s) | 4 hours toggl | Django: function-based & class-based views.
change project to new IDE | 6 tomatoes | install Jupyter & R dependencies, set up mandatory options, try to improve other options; OK to use, know the basic shortcuts, but not so familiar with the new Jupyter environment.

HOW SPARK (BASIC) TIME PLAN FAILED #

In mid-January, I planned to learn Spark basics and deploy it in standalone mode and on Mesos within 4 weeks.

before #

Before this time plan, I had already spent 72 hours on hardware & system installation.

Summary: 27 hours to learn and set up automatic installation (a better automatic method, PXE, could be learned from a teacher).
17 hours to learn X & enable remote X.
10 hours to organize hardware, such as organizing pc cases, tables & cables.

Big Data and Cloud Computing 72 hours
Requirement Meeting 00:30:25
Spark > Meeting 01:34:00
Spark > Enable Router & Remote 10:56:44
Spark > Hardware 09:28:53
Spark > Integrate Matlab 01:40:29
Spark > Mess 01:31:20
Spark > Plan 02:03:40
Spark > System 11:21:47
Spark > System > Auto Install 12:52:49
Spark > System > Debug 01:07:23
Spark > System > LVM Partitions 00:28:25
Spark > System > Server Terminal 03:00:00
Spark > System > X 07:13:22
Spark > Vmware 04:28:53
Spark > Vmware > Matlab Headless 01:15:03
Spark > Vmware > SSH key problem 02:45:09
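
The 72-hour figure can be checked by adding up the logged durations. The following is a minimal sketch, not the tool actually used for the log: the helper names `parse_duration` and `total_hours` are hypothetical, and the parser only assumes the two duration forms that appear in this post (`HH:MM:SS` and `MM:SS min`).

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse 'HH:MM:SS' or 'MM:SS min', the two forms used in the logs."""
    text = text.strip()
    if text.endswith("min"):
        minutes, seconds = (int(p) for p in text[:-3].strip().split(":"))
        return timedelta(minutes=minutes, seconds=seconds)
    hours, minutes, seconds = (int(p) for p in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def total_hours(durations) -> float:
    """Sum a list of duration strings and return the total in hours."""
    total = sum((parse_duration(d) for d in durations), timedelta())
    return total.total_seconds() / 3600


# Durations copied from the log above, in the same order.
BEFORE_PLAN = [
    "00:30:25", "01:34:00", "10:56:44", "09:28:53",
    "01:40:29", "01:31:20", "02:03:40", "11:21:47",
    "12:52:49", "01:07:23", "00:28:25", "03:00:00",
    "07:13:22", "04:28:53", "01:15:03", "02:45:09",
]

print(f"{total_hours(BEFORE_PLAN):.1f} hours")
```

The entries add up to 72:18:22, i.e. about 72.3 hours, which agrees with the rounded total in the log header.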

original plan #

Week 3: install base systems (Ubuntu).
Week 4: hello world (word count) in virtual machines. (+ YARN; I thought YARN was mandatory)
Week 5: Mesos + BIND (self-hosted DNS).
Week 6: deploy on real machines.

results #

Week 3: 40 hours on Spark.
11 hours for hardware (dirty cables).
23 (+5?) hours for setting up virtual environment inc. file sharing.
6.5 hours for surfing info.
Efficiency was sometimes lacking (e.g. 6.5 hours of surfing).

Big Data and Cloud Computing 40 hours
Spark > DataBricks; Github Info and MOOC etc. 06:32:00
Spark > Dirty Cables 09:04:16
Spark > Hardware > Cables 02:05:02
Spark > New VirtualMachine > Remote Work @ Home 05:17:00
Spark > New VirtualMachine > VirtualBox 01:03:52
Spark > New VirtualMachine > Vmware Host-Guest Share Files 02:00:00
Spark > Puppet/Ansible 01:57:17
Spark > Share File 05:32:21
Spark > Windows for VirtualMachine 04:29:35

Week 4: nothing on Spark.
Week 5: 25 hours on Spark [50 hours in total].
7 hours on hardware;
15 hours to learn Scale Py coding (& book).

Big Data and Cloud Computing 25 hours
Spark > Scale Py > Book Only (Accumulated Time) 03:00:00
Spark > Scale Py > Ch1 03:28:29
Spark > Scale Py > Ch2 02:01:14
Spark > Scale Py > Plan (Accumulated Time) 02:00:00
Spark > Hardware 07:04:18
Spark > Scale Py > Remote Jupyter 03:22:25
W > T’s Friend Temp Computer > Reset 03:48:52

Week 6: nothing serious on Spark. Spent 1 hour trying to activate Windows and failed.
Week 7: 10 hours to get the environment kind of ready.

Big Data and Cloud Computing 10 hours
Spark > Scale Py > DL 01:00:00
Spark > Hardware > Dirty Cables 01:33:34
Spark > Software and Hardware > Network 03:37:00
Spark > Software > Network 01:48:10
Spark > Software > Puppet 01:39:05
Spark > Software > System 30:00 min

Week 8: 7.5 hours to learn Scale Py coding & book.

Big Data and Cloud Computing 7.5 hours
Spark > Scale Py > Ch2 06:30:14
Spark > Scale Py > Ch2 > Book Only 20:18 min
Spark > Scale Py > Ch9 > Book Only 22:18 min

Week 9: 85 hours, and got Spark standalone running.
24 hours to learn Scale Py coding & book.
17 hours to set up Spark standalone.
10 hours to disable X to free up more resources.

Big Data and Cloud Computing 85 hours
DS Webinar 01:15:00
Spark > Scale Py > Ch8 04:58:00
Spark > Scale Py > Ch8 > Book Only 01:47:00
Spark > 1st Performance Test > Prepare System 07:28:27
Spark > DNS 02:03:17
Spark > Hardware and Software > Network 01:13:56
Spark > Scale Py > Ch8 07:41:27
Spark > Scale Py > Ch9 06:03:05
Spark > Scale Py > Ch9 > Read Book and Debug of: convert features_header to list 02:14:07
Spark > Scale Py > Ch9 > Understanding 02:00:00
Spark > Self Evaluation, Review, Summary 59:43 min
Spark > Standalone in VirtualBox 07:51:24
Spark > System > Disable X 02:26:12
Spark > System > Disable X & Auto Install Spark 06:19:00
Spark > System > Disable X > Review Basic Linux 54:09 min

other time was used for #

Week 3: 20 hours to write paper.
Week 4: 30 hours to revise paper [40 hours in total]. (Friday was Spring Festival.)
Week 5: 10 hours to write paper, 10 hours TA [50 hours in total]. Finally found the right book/material.
Week 6: 25 hours to write paper, 20 hours TA [45 hours in total]. 1 hour to activate Windows for Spark (failed).
Week 7: 45 hours to write paper, 20 hours TA, 7 hours to welcome professor [45 hours in total].
Week 8: 10 hours for ISP, 5 hours for department meeting, 17 hours to welcome professor, 22 hours for QDA [60 hours in total].
Week 9: 5 hours to welcome professor, 5 hours for a trip to another city campus, 4 hours for QDA [73 hours in total].

analysis of problems (reasons for the delay) #

In total, 168 hours were spent before Spark standalone was running.
0. I did not estimate the time in the right way. For the key tasks, only 8 hours were needed to set up standalone (virtual & real PCs), plus 48 hours of reading & coding with the Scale Py book. Learn: 48 hours + final setup: 8 hours.

  1. The 12.5 hours on dirty cables were not necessary (a lower bound, since the 7 hours logged as “hardware” are unclear and not included).
  2. The 30 hours on automatic installation could be reduced (by at least half? = 15 hours). Thus, 168 hours becomes 140 hours (15% time saved).
  3. In addition, if I had known that Matlab can run without a screen, everything related to X could have been discarded (27 hours).
    Therefore, 168 hours can be reduced to ≈120, and about 1/3 of the time can be saved in total.
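
The arithmetic behind these savings can be written out as a short sketch. Two assumptions are mine rather than stated in the post: that the 168 hours is the rounded sum of the weekly Spark totals (weeks 3, 5, 7, 8 and 9), and that the three savings are simply subtracted one after another. The names `WEEKLY_HOURS`, `SAVINGS` and `remaining_after` are hypothetical.

```python
# Weekly Spark totals as stated in the results section (weeks 3, 5, 7, 8, 9).
# Assumption: these are the hours that make up the 168-hour total.
WEEKLY_HOURS = [40, 25, 10, 7.5, 85]

# Savings named in the analysis, in hours.
SAVINGS = {
    "dirty cables": 12.5,
    "half of automatic installation": 15,
    "everything related to X": 27,
}


def remaining_after(total, savings):
    """Return (label, hours left, fraction saved) after each cumulative saving."""
    steps = []
    remaining = total
    for label, hours in savings.items():
        remaining -= hours
        steps.append((label, remaining, (total - remaining) / total))
    return steps


print(f"sum of weekly totals: {sum(WEEKLY_HOURS)} h")
for label, remaining, saved in remaining_after(168, SAVINGS):
    print(f"without {label}: {remaining:.1f} h left, {saved:.0%} saved")
```

Under these assumptions the straight subtraction gives 140.5 hours after the first two items and 113.5 hours after all three (roughly 16% and 32% saved), so the 140, ≈120, 15% and 1/3 in the text read as rounded estimates.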

solutions #

  1. Make a more detailed plan (in hours): (learning theory & coding from a practical book) + final setup; the same amount of time again for surrounding work (app running environment & system); add 1/3 extra for unexpected waste. YouTube video watching is not included.
  2. Ask for help and delegate. Do not assume that I do everything best myself; only teamwork makes 1+1 ≫ 2 and lets time be allocated sensibly.
  3. Ask experts & teachers for help. It is not mandatory to learn everything; even if I want to learn the details, experts usually provide better methods.
  4. It is almost certain that Matlab can run without a screen. I should confirm this first and develop the system with that limitation (no screen) to cut unnecessary time. If the stakeholders do want a screen later, I can spend extra time to develop this feature.
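
The planning rule in item 1 can be sketched as a small function. This is one possible reading, not a definitive formula: I assume the 1/3 buffer applies to the sum of core and surrounding work, and the function name `plan_hours` and its parameters are hypothetical. The 48 + 8 hours from the analysis are used only as an illustration.

```python
def plan_hours(learn, final_setup, buffer_fraction=1 / 3):
    """Estimate total planned hours from the two core tasks.

    Assumption: the buffer is a fraction of (core + surrounding) work.
    """
    core = learn + final_setup      # learning theory & coding + final setup
    surrounding = core              # the same time again for surrounding work
    buffer = (core + surrounding) * buffer_fraction  # unexpected waste
    return core + surrounding + buffer


# Illustration with the figures from the analysis: 48 h learning, 8 h setup.
print(f"{plan_hours(48, 8):.0f} hours")
```

With these inputs the plan comes to about 149 hours, which is the kind of figure that would have been a more realistic budget than 4 weeks of spare time.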

PS #

  • The paper reading time is for qiqqa reading & note-taking.
  • Sentdex’s video is 3 hours, but 2.5 hours without the server & SSL parts.
  • est. := estimated, used when the tomato log is missing.
  • The tomatoes in the tables are standard tomatoes (1 tomato = 25 min + 5 min); if the actual recorded time is not 25 min, it is converted to the standard value.
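
The conversion to standard tomatoes can be sketched as follows. The post does not say how partial tomatoes are rounded or whether the recorded time includes breaks, so both choices here are assumptions: recorded time is treated as elapsed time including breaks, and partial tomatoes are rounded up by default. The name `to_standard_tomatoes` is hypothetical.

```python
import math

TOMATO_MINUTES = 25 + 5  # 25 min of work + 5 min of break, as defined above


def to_standard_tomatoes(recorded_minutes, round_up=True):
    """Convert recorded minutes to standard 30-minute tomatoes.

    Assumption: rounding up partial tomatoes is an illustrative choice.
    """
    tomatoes = recorded_minutes / TOMATO_MINUTES
    return math.ceil(tomatoes) if round_up else tomatoes


print(to_standard_tomatoes(60))                   # one hour of recorded time
print(to_standard_tomatoes(45, round_up=False))   # keep the fraction
```

This matches the first table row, where "1h+" is listed as "2+ tomatoes".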