Tomato Time Estimation, Time Plan
PAPERS & WRITING #
Action | Time | Description & Notes |
---|---|---|
self-archive/publish 1 paper full-text online | 2+ tomatoes | 1 h+, w/o code; GitHub + RG + Google Scholar, etc. |
converting 1 paper (A4, 2-col, 11 pages) from MS Word to LaTeX | 13 tomatoes | 1–1.5 days |
reading 1 paper (qiqqa) | 1 tomato | average time for NEW papers, regardless of scanning or deep reading. |
scanning 3 papers (qiqqa) | 1 tomato | judge whether the paper is relevant or not; keywords, key procedure & key conclusion, citation points. |
deep reading 1 paper (qiqqa) | 3 tomatoes | at least 3. |
self revision of draft > quick mark & correction (real pen) | 2 tomatoes | 19 pages (w/o bibliography). |
self revision of draft > quick apply correction (latex) | 4 tomatoes | 1 more is needed if also correcting newly found small issues (e.g. typos) during the quick apply. 19 pages (w/o bibliography). |
detailed self revision of draft > mark & correct both structure & language (real pen) | 11 tomatoes | 26 pages (w/o bibliography); got bored after 5 tomatoes. +7 tomatoes if content-wise changes are also needed. |
update of GPU draft result section due to exp update > mark & correct & compare with old results | x tomatoes | boring because there was no big difference, yet it used a lot of time. |
update of GPU draft analysis section due to exp results update > mark & correct | 10 tomatoes | 2 full-text pages; structure & language & comparing new results with literature (essentially a rewrite). |
SARIMA: math, code, debug and parallel | 32 tomatoes | a full 4 days, including 5 tomatoes for debugging. |
SARIMA: parallel: try & debug | 5 tomatoes | first time implementing a parallel program in R; tried 3 different ways (doParallel+foreach; parallel+…). See the sketch below this table. |
Techila* platform: usage basics, try examples, implement my solution & debug | 19 tomatoes | 1 tomato to install; 3 tomatoes to try the different official examples and decide on a solution: foreach(); 5 tomatoes to learn & implement my solution using foreach(); 9 tomatoes to debug Techila's own problems (e.g. how to use the lib, how to upload data in a tricky way); have not tried Techila's own way to upload data. |
*: [Techila] Good docs: very easy to follow the manual/tutorial to run the official examples on Google Cloud Platform, but it takes some time to get one's own solution running.
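The SARIMA parallelization above was done in R (doParallel+foreach); as a reference point, here is a rough Python analogue of the same per-series split, a minimal sketch assuming statsmodels' SARIMAX and placeholder orders and data:

```python
from multiprocessing import Pool

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

def fit_one(series):
    # fit one seasonal ARIMA per worker; (p,d,q)(P,D,Q,s) are placeholders
    model = SARIMAX(series, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
    return model.fit(disp=False).aic

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # four independent toy series, one per worker, like foreach's iteration split
    series_list = [pd.Series(rng.normal(size=120)) for _ in range(4)]
    with Pool(processes=4) as pool:
        print(pool.map(fit_one, series_list))
```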
EXPERIMENTS, VISUALIZATION & CODING #
Action | Time | Description & Notes |
---|---|---|
learning Django MVC from a good video tutorial | 18 tomatoes (est.) | 2.5-hour video (see PS) => 11 hours in toggl. Only learned the basics, don't know why; notes in blog. (Background: already knew Zend MVC, didn't know Python.) |
learning Django rest_framework from two videos | 18 tomatoes (est.) | 1 hour, 2 videos => 10 hours in toggl; with unclear explanations and only partial code, it took longer to follow and understand; started to get a feeling for Django and its REST API; notes in blog (a DRF sketch follows this table). |
learning Model Form (incl. create/update/delete) | 8 hours toggl | incl. 1 hour to find the right tutorial videos (a CRUD sketch follows this table). |
learning, trying & comparing different ways of uploading file(s) | 4 hours toggl | Django: function-based & class-based views (an upload sketch follows this table). |
change project to new IDE | 6 tomatoes | install Jupyter & R dependencies, set up the mandatory options, try to improve other options; OK to use, know the basic shortcuts, but not yet familiar with the new Jupyter environment. |
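A minimal sketch of the Django REST framework pattern those videos covered; the Note model, its fields, and the route name are made-up examples, not from the tutorials:

```python
# models.py / serializers.py / views.py / urls.py condensed for illustration
from django.db import models
from rest_framework import routers, serializers, viewsets

class Note(models.Model):
    title = models.CharField(max_length=100)
    body = models.TextField()

class NoteSerializer(serializers.ModelSerializer):
    class Meta:
        model = Note                      # serialize straight from the model
        fields = ["id", "title", "body"]

class NoteViewSet(viewsets.ModelViewSet):
    queryset = Note.objects.all()         # full CRUD REST API in a few lines
    serializer_class = NoteSerializer

# urls.py: the router generates list/detail routes automatically
router = routers.DefaultRouter()
router.register(r"notes", NoteViewSet)
urlpatterns = router.urls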
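And a sketch of the Model Form create/update/delete flow with Django's generic class-based views; the Note model (from the sketch above) and the URL name are again placeholders:

```python
# Generic class-based views wire a ModelForm to create/update/delete.
from django.urls import reverse_lazy
from django.views.generic.edit import CreateView, DeleteView, UpdateView

class NoteCreate(CreateView):
    model = Note                      # Note as defined in the sketch above
    fields = ["title", "body"]        # Django builds the ModelForm for us
    success_url = reverse_lazy("note-list")

class NoteUpdate(UpdateView):
    model = Note
    fields = ["title", "body"]
    success_url = reverse_lazy("note-list")

class NoteDelete(DeleteView):
    model = Note
    success_url = reverse_lazy("note-list")
```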
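Finally, the two upload styles compared above, side by side; the form, template name, and save_upload helper are hypothetical:

```python
from django import forms
from django.shortcuts import redirect, render
from django.views.generic.edit import FormView

class UploadForm(forms.Form):
    file = forms.FileField()

def save_upload(f):
    """Hypothetical helper: stream the uploaded file to disk."""
    with open(f.name, "wb") as out:
        for chunk in f.chunks():
            out.write(chunk)

def upload_fbv(request):
    # function-based view: branch on method, read request.FILES directly
    if request.method == "POST":
        form = UploadForm(request.POST, request.FILES)
        if form.is_valid():
            save_upload(form.cleaned_data["file"])
            return redirect("/done/")
    else:
        form = UploadForm()
    return render(request, "upload.html", {"form": form})

class UploadCBV(FormView):
    # class-based view: FormView handles the GET/POST branching for us
    template_name = "upload.html"
    form_class = UploadForm
    success_url = "/done/"

    def form_valid(self, form):
        save_upload(form.cleaned_data["file"])
        return super().form_valid(form)
```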
HOW THE SPARK (BASICS) TIME PLAN FAILED #
In mid-January, I planned to learn Spark basics and deploy it in standalone mode & on Mesos within 4 weeks.
before #
Before this time plan, I had spent 72 hours on hardware & system installation.
Summary:
27 hours to learn & build automatic installation (a better automated method, PXE, could be learned from a teacher).
17 hours to learn X & enable remote X.
10 hours to organize hardware, such as organizing pc cases, tables & cables.
Big Data and Cloud Computing | 72 hours |
---|---|
Requirement Meeting | 00:30:25 |
Spark > Meeting | 01:34:00 |
Spark > Enable Router & Remote | 10:56:44 |
Spark > Hardware | 09:28:53 |
Spark > Integrate Matlab | 01:40:29 |
Spark > Mess | 01:31:20 |
Spark > Plan | 02:03:40 |
Spark > System | 11:21:47 |
Spark > System > Auto Install | 12:52:49 |
Spark > System > Debug | 01:07:23 |
Spark > System > LVM Partitions | 00:28:25 |
Spark > System > Server Terminal | 03:00:00 |
Spark > System > X | 07:13:22 |
Spark > Vmware | 04:28:53 |
Spark > Vmware > Matlab Headless | 01:15:03 |
Spark > Vmware > SSH key problem | 02:45:09 |
original plan #
Week 3: install base systems (Ubuntu).
Week 4: hello world (word count) in virtual machines (+ YARN; I thought YARN was mandatory). A word-count sketch follows this list.
Week 5: Mesos + BIND (self-hosted DNS).
Week 6: deploy on real machines.
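For reference, the planned "hello world" in PySpark, a minimal sketch assuming an input.txt exists; on the cluster, the standalone URL spark://<master-host>:7077 would replace "local[*]":

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("wordcount").getOrCreate()

counts = (
    spark.sparkContext.textFile("input.txt")   # one RDD element per line
    .flatMap(lambda line: line.split())        # split lines into words
    .map(lambda word: (word, 1))               # pair each word with a count of 1
    .reduceByKey(lambda a, b: a + b)           # sum the counts per word
)
print(counts.collect())
spark.stop()
```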
results #
Week 3: 40 hours on Spark.
11 hours for hardware (dirty cables).
23 (+5?) hours for setting up the virtual environment, incl. file sharing.
6.5 hours for surfing info.
Sometimes lacking efficiency (e.g. the 6.5 hours of surfing).
Big Data and Cloud Computing | 40 hours |
---|---|
Spark > DataBricks; Github Info and MOOC etc. | 06:32:00 |
Spark > Dirty Cables | 09:04:16 |
Spark > Hardware > Cables | 02:05:02 |
Spark > New VirtualMachine > Remote Work @ Home | 05:17:00 |
Spark > New VirtualMachine > VirtualBox | 01:03:52 |
Spark > New VirtualMachine > Vmware Host-Guest Share Files | 02:00:00 |
Spark > Puppet/Ansible | 01:57:17 |
Spark > Share File | 05:32:21 |
Spark > Windows for VirtualMachine | 04:29:35 |
Week 4: nothing on Spark.
Week 5: 25 hours on Spark [50 hours in total].
7 hours on hardware;
15 hours to learn Scale Py coding (& book).
Big Data and Cloud Computing | 25 hours |
---|---|
Spark > Scale Py > Book Only (Accumulated Time) | 03:00:00 |
Spark > Scale Py > Ch1 | 03:28:29 |
Spark > Scale Py > Ch2 | 02:01:14 |
Spark > Scale Py > Plan (Accumulated Time) | 02:00:00 |
Spark > Hardware | 07:04:18 |
Spark > Scale Py > Remote Jupyter | 03:22:25 |
W > T’s Friend Temp Computer > Reset | 03:48:52 |
Week 6: nothing serious on Spark. Tried for 1 hour to activate Windows and failed.
Week 7: 10 hours to get the environment kind of ready.
Big Data and Cloud Computing | 10 hours |
---|---|
Spark > Scale Py > DL | 01:00:00 |
Spark > Hardware > Dirty Cables | 01:33:34 |
Spark > Software and Hardware > Network | 03:37:00 |
Spark > Software > Network | 01:48:10 |
Spark > Software > Puppet | 01:39:05 |
Spark > Software > System | 00:30:00
Week 8: 7.5 hours to learn Scale Py coding & book.
Big Data and Cloud Computing | 7.5 hours |
---|---|
Spark > Scale Py > Ch2 | 06:30:14 |
Spark > Scale Py > Ch2 > Book Only | 00:20:18
Spark > Scale Py > Ch9 > Book Only | 00:22:18
Week 9: 85 hours, and got Spark standalone running.
24 hours to learn Scale Py coding & book.
17 hours to set up Spark standalone.
10 hours to disable X to free up more resources.
Big Data and Cloud Computing | 85 hours |
---|---|
DS Webinar | 01:15:00 |
Spark > Scale Py > Ch8 | 04:58:00 |
Spark > Scale Py > Ch8 > Book Only | 01:47:00 |
Spark > 1st Performance Test > Prepare System | 07:28:27 |
Spark > DNS | 02:03:17 |
Spark > Hardware and Software > Network | 01:13:56 |
Spark > Scale Py > Ch8 | 07:41:27 |
Spark > Scale Py > Ch9 | 06:03:05 |
Spark > Scale Py > Ch9 > Read Book and Debug of: convert features_header to list | 02:14:07 |
Spark > Scale Py > Ch9 > Understanding | 02:00:00 |
Spark > Self Evaluation, Review, Summary | 00:59:43
Spark > Standalone in VirtualBox | 07:51:24 |
Spark > System > Disable X | 02:26:12 |
Spark > System > Disable X & Auto Install Spark | 06:19:00 |
Spark > System > Disable X > Review Basic Linux | 00:54:09
other time was used for #
Week 3: 20 hours to write paper.
Week 4: 30 hours to revise paper [40 hours in total] (Friday was the Spring Festival).
Week 5: 10 hours to write paper, 10 hours TA [50 hours in total]; finally found the right book/material.
Week 6: 25 hours to write paper, 20 hours TA [45 hours in total]; 1 hour to activate Windows for Spark (failed).
Week 7: 45 hours to write paper, 20 hours TA, 7 hours to welcome a professor [45 hours in total].
Week 8: 10 hours for ISP, 5 hours for a department meeting, 17 hours to welcome a professor, 22 hours for QDA [60 hours in total].
Week 9: 5 hours to welcome a professor, 5 hours at another city campus, 4 hours for QDA [73 hours in total].
analysis of problems (reasons for the delay) #
In total, 168 hours were spent before Spark standalone was running.
0. Did not estimate the time in the right way. For the key tasks, only 8 hours were needed to set up standalone (virtual & real PCs), plus 48 hours reading & coding through the Scale Py book. Learning: 48 hours + final setup: 8 hours.
- 12.5 hours on dirty cables was not necessary (at the least; the 7 hours logged as "hardware" is unclear and not included here).
- 30 hours on automatic installation could be reduced (by at least half = 15 hours). Thus, 168 hours becomes ≈140 hours (12.5 + 15 = 27.5 hours, ~16% saved).
- In addition, had I known Matlab can run without a screen, everything related to X could have been discarded (27 hours: 17 to learn & enable remote X + 10 to disable X).
Therefore, 168 hours could be reduced to ≈113 (12.5 + 15 + 27 ≈ 55 hours saved), i.e. about 1/3 of the time saved in total.
solutions #
- Make a more detailed plan (in hours): (learn theory & coding from a practical book) + final setup; budget the same amount of time for the surrounding work (app running environment & system); add 1/3 for unexpected waste. YouTube video watching is not included.
- Ask for help and delegate authority. Don't assume you do everything best yourself; only teamwork makes 1 + 1 ≫ 2 and allows time to be allocated sensibly.
- Ask experts & teachers for help. It is not mandatory to learn everything; even if I want to learn the details, experts usually provide better methods.
- It is almost certain that Matlab can run without a screen; I should have confirmed that first and developed the system with that limitation (no screen) to cut unnecessary time. If the stakeholders do want the feature later, I can spend extra time to develop it. A headless-launch sketch follows.
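A minimal sketch of launching Matlab headless from Python; the -nodisplay/-nosplash/-r flags are standard Matlab CLI options, while the run_job script name is a placeholder:

```python
import subprocess

# run a Matlab script with no display, then exit; raises if Matlab fails
subprocess.run(
    ["matlab", "-nodisplay", "-nosplash", "-r", "run_job; exit"],
    check=True,
)
```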
PS #
- The paper reading time is for qiqqa reading & noting.
- Sentdex's video is 3 hours, but 2.5 hours without the server & SSL parts.
- est.: estimated, used when the tomato log is missing.
- The tomatoes in the tables are shown as standard tomatoes (1 tomato = 25 min + 5 min); if the actual recorded time was not 25 min, it was converted to standard values.