PAPERS & WRITING #
|Action||Time||Description & Notice|
|self-archive/publish 1 paper full-text online||2+ tomatoes||1h+, w/o code; GitHub + RG + Google Scholar etc.|
|converting 1 paper (A4, 2-col, 11 pages) from MS Word to LaTeX||13 tomatoes||1–1.5 days|
|reading 1 paper (qiqqa)||1 tomato||average time for NEW papers, whether scanning or deep reading.|
|scanning 3 papers (qiqqa)||1 tomato||judge whether the paper is relevant or not; keywords, key procedure, key conclusions, citation points.|
|deep reading 1 paper (qiqqa)||3 tomatoes||at least 3.|
|self revision of draft > quick mark & correction (real pen)||2 tomatoes||19 pages (w/o bibliography).|
|self revision of draft > quick apply correction (latex)||4 tomatoes||1 more is needed if also correcting newly found small issues (e.g. typos) during the quick apply. 19 pages (w/o bibliography).|
|detailed self revision of draft > mark & correct both structure & language (real pen)||11 tomatoes||26 pages (w/o bibliography); got bored after 5 tomatoes; +7 tomatoes if content-wise changes are also needed.|
|update of GPU draft result section due to exp update > mark & correct & compare with old results||x tomatoes||boring because there was no big difference from the old results, yet it used a lot of time.|
|update of GPU draft analysis section due to exp results update > mark & correct||10 tomatoes||2 full-text pages; structure & language & comparing new results with the literature (kind of a rewrite).|
|SARIMA: math, code, debug and parallel||32 tomatoes||a full 4 days, including 5 tomatoes for debugging.|
|SARIMA: parallel: try & debug||5 tomatoes||first time implementing a parallel program in R. Did it in 3 different ways (doParallel + foreach; parallel + …).|
|Techila* platform: usage basics, try examples, implement my solution & debug||19 tomatoes||1 tomato to install; 3 tomatoes to try the different official examples and decide on a solution: foreach(); 5 tomatoes to learn & implement my solution using foreach(); 9 tomatoes to debug Techila’s own problems (e.g. how to use a lib, how to upload data in a tricky way); have not tried Techila’s own way of uploading data.|
[Techila] Good docs; very easy to follow the manual/tutorial to run the official examples on Google Cloud Platform, but it takes some time to get your own solution running.
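The SARIMA parallelisation above was done in R with doParallel + foreach (one independent model fit per worker). As a rough, language-neutral sketch of that pattern, here is a hypothetical parallel grid search over SARIMA orders using only Python's standard library; `score_order` is a stand-in I made up for a real fitting call (e.g. fit the model and return its AIC), not the code actually used.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def score_order(order):
    # Hypothetical stand-in for fitting a SARIMA model with this
    # (p, d, q) order and returning an information criterion
    # (lower is better); a real version would fit the model here.
    p, d, q = order
    return (p - 1) ** 2 + (d - 1) ** 2 + (q - 2) ** 2

def best_order(max_p=2, max_d=1, max_q=2, workers=4):
    # Score every candidate order concurrently, mirroring the
    # doParallel + foreach pattern: each candidate is independent.
    grid = list(product(range(max_p + 1), range(max_d + 1), range(max_q + 1)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(score_order, grid))
    return min(zip(scores, grid))[1]

print(best_order())  # (1, 1, 2): the candidate with the lowest score
```

For real CPU-bound model fitting, `ProcessPoolExecutor` would be the closer analogue of R's worker processes; threads are used here only to keep the sketch simple.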
EXPERIMENTS, VISUALIZATION & CODING #
|Action||Time||Description & Notice|
|learning Django MVC from a good video tutorial||18 tomatoes (est.)||a 2.5-hour video => 11 hours in Toggl. Only learned the basics; not sure why it took so long; notes in blog. (Background: already knew Zend MVC, did not know Python.)|
|learning Django rest_framework from two videos||18 tomatoes (est.)||1 hour of video (2 videos) => 10 hours in Toggl; with unclear explanations and only partial code, it took longer to follow and understand; started to get a feel for Django and its REST API; notes in blog.|
|learning Model Form (incl. create/update/delete)||8 hours (Toggl)||incl. 1 hour to find the right tutorial videos.|
|learning, trying & comparing different ways of uploading file(s)||4 hours (Toggl)||Django: function-based & class-based views.|
|changing project to a new IDE||6 tomatoes||installed Jupyter & R dependencies, set up the mandatory options, tried improving other options; OK to use, know the basic shortcuts, but not yet familiar with the new Jupyter environment.|
HOW THE SPARK (BASICS) TIME PLAN FAILED #
In mid-January, I planned to learn Spark basics and deploy it in standalone mode & on Mesos within 4 weeks.
Before this plan, I had already spent 72 hours on hardware & system installation.
Summary: 27 hours to learn & build an automatic installation (a better automatic method, PXE, could be learned from a teacher);
17 hours to learn X and enable remote X;
10 hours to organize hardware, such as PC cases, tables & cables.
|Big Data and Cloud Computing||72 hours|
|Spark > Meeting||01:34:00|
|Spark > Enable Router & Remote||10:56:44|
|Spark > Hardware||09:28:53|
|Spark > Integrate Matlab||01:40:29|
|Spark > Mess||01:31:20|
|Spark > Plan||02:03:40|
|Spark > System||11:21:47|
|Spark > System > Auto Install||12:52:49|
|Spark > System > Debug||01:07:23|
|Spark > System > LVM Partitions||00:28:25|
|Spark > System > Server Terminal||03:00:00|
|Spark > System > X||07:13:22|
|Spark > Vmware||04:28:53|
|Spark > Vmware > Matlab Headless||01:15:03|
|Spark > Vmware > SSH key problem||02:45:09|
original plan #
Week 3: install base systems (Ubuntu).
Week 4: hello world (word count) in virtual machines. (+ YARN; I thought YARN was mandatory.)
Week 5: Mesos + BIND (self-hosted DNS).
Week 6: deploy on real machines.
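The "hello world" planned for Week 4 is the classic word count. For reference, here is the same computation in plain Python (no Spark dependency), with the PySpark RDD chain it corresponds to sketched in a comment:

```python
from collections import Counter

def word_count(lines):
    # Plain-Python version of the Spark hello world; same shape as
    #   flatMap(split) -> map((word, 1)) -> reduceByKey(+)
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)

# PySpark equivalent (sketch, assuming an RDD from sc.textFile):
# rdd.flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)

print(word_count(["hello world", "hello spark"]))
# {'hello': 2, 'world': 1, 'spark': 1}
```

The point of doing this on a cluster is not the computation itself but verifying that the standalone deployment can distribute it across workers.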
Week 3: 40 hours on Spark.
11 hours on hardware (dirty cables).
23 (+5?) hours setting up the virtual environment, incl. file sharing.
6.5 hours surfing for info.
Sometimes lacked efficiency (e.g. the 6.5 hours of surfing).
|Big Data and Cloud Computing||40 hours|
|Spark > Databricks; GitHub Info and MOOC etc.||06:32:00|
|Spark > Dirty Cables||09:04:16|
|Spark > Hardware > Cables||02:05:02|
|Spark > New VirtualMachine > Remote Work @ Home||05:17:00|
|Spark > New VirtualMachine > VirtualBox||01:03:52|
|Spark > New VirtualMachine > Vmware Host-Guest Share Files||02:00:00|
|Spark > Puppet/Ansible||01:57:17|
|Spark > Share File||05:32:21|
|Spark > Windows for VirtualMachine||04:29:35|
Week 4: nothing on Spark.
Week 5: 25 hours on Spark [50 hours in total].
7 hours on hardware;
15 hours to learn Scale Py coding (& the book).
|Big Data and Cloud Computing||25 hours|
|Spark > Scale Py > Book Only (Accumulated Time)||03:00:00|
|Spark > Scale Py > Ch1||03:28:29|
|Spark > Scale Py > Ch2||02:01:14|
|Spark > Scale Py > Plan (Accumulated Time)||02:00:00|
|Spark > Hardware||07:04:18|
|Spark > Scale Py > Remote Jupyter||03:22:25|
|W > T’s Friend Temp Computer > Reset||03:48:52|
Week 6: nothing serious on Spark. Spent 1 hour trying to activate Windows, and failed.
Week 7: 10 hours to get the environment kind of ready.
|Big Data and Cloud Computing||10 hours|
|Spark > Scale Py > DL||01:00:00|
|Spark > Hardware > Dirty Cables||01:33:34|
|Spark > Software and Hardware > Network||03:37:00|
|Spark > Software > Network||01:48:10|
|Spark > Software > Puppet||01:39:05|
|Spark > Software > System||00:30:00|
Week 8: 7.5 hours to learn Scale Py coding & the book.
|Big Data and Cloud Computing||7.5 hours|
|Spark > Scale Py > Ch2||06:30:14|
|Spark > Scale Py > Ch2 > Book Only||00:20:18|
|Spark > Scale Py > Ch9 > Book Only||00:22:18|
Week 9: 85 hours, and got Spark standalone running.
24 hours to learn Scale Py coding & the book.
17 hours to set up Spark standalone.
10 hours to disable X to free up resources.
|Big Data and Cloud Computing||85 hours|
|Spark > Scale Py > Ch8||04:58:00|
|Spark > Scale Py > Ch8 > Book Only||01:47:00|
|Spark > 1st Performance Test > Prepare System||07:28:27|
|Spark > DNS||02:03:17|
|Spark > Hardware and Software > Network||01:13:56|
|Spark > Scale Py > Ch8||07:41:27|
|Spark > Scale Py > Ch9||06:03:05|
|Spark > Scale Py > Ch9 > Read Book and Debug of: convert features_header to list||02:14:07|
|Spark > Scale Py > Ch9 > Understanding||02:00:00|
|Spark > Self Evaluation, Review, Summary||00:59:43|
|Spark > Standalone in VirtualBox||07:51:24|
|Spark > System > Disable X||02:26:12|
|Spark > System > Disable X & Auto Install Spark||06:19:00|
|Spark > System > Disable X > Review Basic Linux||00:54:09|
other time was used for #
Week 3: 20 hours to write paper.
Week 4: 30 hours to revise paper [40 hours in total]. (Friday was Spring Festival.)
Week 5: 10 hours to write paper, 10 hours TA [50 hours in total]. Finally found the right book/material.
Week 6: 25 hours to write paper, 20 hours TA [45 hours in total]. 1 hour trying to activate Windows for Spark (failed).
Week 7: 45 hours to write paper, 20 hours TA, 7 hours to welcome a professor [45 hours in total].
Week 8: 10 hours for ISP, 5 hours for a department meeting, 17 hours to welcome the professor, 22 hours for QDA [60 hours in total].
Week 9: 5 hours to welcome the professor, 5 hours at another city campus, 4 hours for QDA [73 hours in total].
analysis of problems (reasons for the delay) #
In total, 168 hours were spent before Spark standalone was running.
0. Did not estimate the time in the right way. For the key tasks, only 8 hours were needed to set up standalone mode (virtual & real PCs), plus 48 hours reading & coding along with the Scale Py book. Learning: 48 hours + final setup: 8 hours.
- The 12.5 hours on dirty cables was not necessary (at least that much; another 7 hours logged as “hardware” is unclear and not included).
- The 30 hours on automatic installation could be reduced, at least by half (= 15 hours saved). Thus 168 hours becomes ~140 hours (≈16% of the time saved).
- In addition, if I had known Matlab can run without a screen, everything related to X could have been discarded (27 hours).
Therefore, 168 hours could have been reduced to ≈113, i.e. about 1/3 of the time could have been saved.
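Summing the individual reductions listed above gives a quick sanity check of the overall savings claim (the figures are the ones stated in the bullets; the totals are just arithmetic on them):

```python
# Re-derive the claimed savings from the individual items above.
total = 168.0                      # hours actually spent
savings = 12.5 + 15.0 + 27.0       # cables + half of auto-install + X-related work
remaining = total - savings        # hours that would have been left
fraction_saved = savings / total   # fraction of time saved

print(remaining, round(fraction_saved, 2))  # 113.5 0.32
```

54.5 of 168 hours is about 32%, which is where the "roughly 1/3 saved" figure comes from.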
- Make a more detailed plan (in hours): (learn theory & coding, following a practical book) + final setup; the same amount of time for the surrounding work (app runtime environment & system); add 1/3 on top for unexpected waste. YouTube video watching is not included.
- Ask for help and delegate. Don’t assume you do everything best yourself; only with teamwork can 1 + 1 ≫ 2, and only then can time be allocated sensibly.
- Ask experts & teachers for help. It is not mandatory to learn everything yourself; even when I want to learn the details, experts usually provide better methods.
- It was fairly certain that Matlab can run without a screen; I should have confirmed that first and developed the system with that limitation (no screen) to cut unnecessary time. If stakeholders really wanted the feature later, I could spend extra time developing it.
- The paper reading times are for reading & noting in qiqqa.
- Sentdex’s video is 3 hours, but 2.5 hours without server & ssl part.
- est.: estimated, used where the tomato log is missing.
- The tomatoes in the tables are standard tomatoes (1 tomato = 25 min + 5 min); if the actual recorded time is not 25 min, it is converted to standard values.
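The exact conversion rule for non-standard sessions is not stated; a minimal sketch, assuming recorded focused minutes are simply normalised by the 25-minute work part of a standard tomato:

```python
def to_standard_tomatoes(recorded_minutes):
    # Hypothetical conversion: divide by the 25-minute focused part
    # of a standard tomato (25 min work + 5 min break). The exact
    # rule used for the tables above is an assumption here.
    return round(recorded_minutes / 25, 1)

print(to_standard_tomatoes(50))  # 2.0
print(to_standard_tomatoes(70))  # 2.8
```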