Monday, May 26, 2014

Kids Are Icebound by ‘Frozen’ Fervor - New York Times

Kids Are Icebound by ‘Frozen’ Fervor: Disney’s Animated Film ‘Frozen’ Has Some Children Obsessed. By JOANNA COHEN, MAY 16, 2014

Like a very large portion of the population with children of preschool age, we were caught up in the Frozen craze.  The movie came and went without much notice, but by the time the singalong version came out in theaters, T's entire daycare was singing it.  He had even started to learn the words to 'Let It Go', and we felt compelled to find the video so he could at least learn it properly. We watched multiple videos of it. T somehow finds everything about Frozen, even a video series on styling hair like the main characters.  We put it in our Netflix queue, so when it came out we got it and watched it a few times.  Even with all that saturation, I still think it is a very good movie.

1.  The Let It Go video is a great level up sequence. In addition to the song being an anthem to independence, the music video is a great sequence of a character growing in capability. The character of Elsa is shown creating ice and snow figures in increasing complexity as the song progresses, which is how people learn skills in real life.

2.  The characterization is meant for good role playing. I found myself thinking of the characters as Fate Core, Fate Accelerated, or Risus characters, with strengths, weaknesses, troubles, and character traits that get compelled and applied in many ways.  And the way the characters act is basically a superhero team-up story line, with the main characters learning how to work together as they go.

3. The characters have strengths, weaknesses, and character traits that help and hinder them (a plainer way of stating 2).  The things that make each character strong and compelling are also the things that get them into trouble. No Mary Sue or damsel in distress characters. Even the bad guys have strong characterizations (which is why it is hard to recognize the bad guy).

4. Even the ditzy one has strength of character, to include strength in the physical world, not just amorphous inner strength.

5. A female lead has as part of her characterization "likes math" (specifically geometry).  Math, physics and chemistry matter in this world.

6.  Relationships are not just about romance. The tendency of people to think that relationships are only about romance is addressed through the problems it causes: the key plot point is that the characters make this assumption and forget that the prophecy could be referring to other types of love.

Now, we still think that other sources of songs such as Sound of Music and Les Misérables are more welcome than Frozen, but for a film that is a source of influence, Frozen is not all that bad.  And, to answer the question in The Motherlode (NYTimes) article on the subject (Parents respond to the 'Frozen' frenzy), we prefer the Idina Menzel version of 'Let It Go' over the Demi Lovato version.

Monday, May 19, 2014

A week of siblings: Parenting

Holding little sister

As a mid-month update, our big parenting news is that instead of having an experiment of one, we have an experiment of a pair. We have of course been using books to get T used to the idea that mei-mei (little sister) was coming, and reading books with him about being a big brother, and letting him feel mei-mei in mommy's belly, and helping in the preparations for mei-mei's arrival.
I have to set up the sleeper I made for mei-mei

But with all of this we have always wondered if he really understood it (other than those times where he clearly figured out that having a little sister would mean that there is less for him). Well, now she is here. At this point it is a lot of excitement to have something new and small in the house. One of the most vivid memories he has of mei-mei's first night back was of her pooping (for the first couple of days, newborn poop is black meconium, and she was pooping during a diaper change so he got to see it).
 Reading a bedtime story for baby sister

There is definitely an element of excitement. And there are things he can do, like read books for mei-mei. And he likes to take pictures of mei-mei.

Photo shoot in progress

And pictures with his new little sister.
We're having a photo shoot of my baby sister

A recital for baby sister

He has figured out that he now has competition for that scarce resource called mother's time. So we notice that the house is a little louder now, with him talking very loudly to make sure he has people's attention. We have him sleeping with daddy every night now (he has never been a good sleeper), and there were a couple of nights where he wanted to go with mommy, but that has ended (with the realization that it is very easy to bargain daddy up to five books at bedtime instead of only two or three). But, at the one week point, big brother views little sister as a little doll that sometimes makes a bit of noise, and he remains very happy, if very talkative and loud. And mei-mei does not seem to mind either. So far so good.

I love my baby sister

Sunday, May 18, 2014

The Human: War of the Seasons Book #1 by Janine Spendlove

The Human (War of the Seasons, #1) by Janine K. Spendlove
My rating: 4 of 5 stars

The Human is about a 17 year old named Story, who has lost her family and has been spending some time driving away her friends. While spelunking with her remaining friend she goes through a portal and comes out into a world where the once immortal race of elves is slowly dying away and as a human she holds the key to ending their curse.

The story aside, the book is about a lonely and somewhat socio-phobic young adult thrust into a fantastic setting. And while physically competent as a young adult (she is a spelunker, and good with a knife and bow), the people she meets in this new world are wise to the ways of the world, cunning and remorseless. And her wits are not helpful and her people skills are not much better in this setting. So The Human is her learning how to deal with others instead of being the outcast teenager she was. She learns (mostly by mistakes) about reading people, of considering the motivations of others, of having trust betrayed, and trust rewarded. She has relationships that are purely manipulative, some that are genuine care, some that are sacrificial, and even one that is without any agenda. And as she goes from an isolated teenager into a young adult with a range of relationships, her view of the world and her place in it grows.

There is a romance in the book with the female protagonist. But I appreciate the fact that she had a chance to become competent as a person (both in skills and how she related towards others) and other aspects of her growth were more important than the relationship for most of the book. And that this growth is what made her attractive to the other person in the end.

This was an engrossing story. I liked following the main character's growth as a person, and how it fits in with the overall story. I'm looking forward to the other books in this series.


Friday, May 16, 2014

Notes from teaching data science for the first time

Drew Conway Data Science Venn Diagram
I spent this past semester teaching a course in data science. While there has been a data mining course taught in the department, it is offered irregularly and had a different focus.  The premise for the course I taught was that data science is the intersection of data hacking, mathematical and statistical methods, and domain knowledge (with props to Drew Conway). The students I had generally had little to no programming experience (or no meaningful background). All had taken a first course in statistics.

I used two texts. First was Stanton's Introduction to Data Science, which is used in the Syracuse Data Science certificate program.  Second was Data Mining with R by Luís Torgo.  All of the students were also told to go through Introduction to R prior to the beginning of the course (or as early as possible).

The class started off going through Introduction to Data Science, which includes a few introductory chapters on data analysis and an introduction to R and the RStudio IDE.  Next came chapters covering basic methods such as text processing and a review of regression, then additional methods such as association rules and support vector machines.  We then switched to Data Mining with R, which is a series of case studies.  Each case study required some form of data munging (manipulation), with the first one having an involved demonstration of how to handle missing values, either determining the correct value or removing the observation as appropriate.  Each case also had a lengthy discussion of the methodologies used, covering what each is used for, a basic understanding of how it works, and its implementation using R libraries (there is a book package, but it has mostly data sets and some functions to assist in data manipulation and visualization).
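To make the missing-value step concrete, here is a minimal sketch in Python (the course itself used R, and this is not the book's code). The records, field names, and the `max_missing` threshold are all made up for illustration. It shows the two options mentioned above: remove observations with too many gaps, then fill the remaining gaps with a central value.

```python
from statistics import median

# Hypothetical records; None marks a missing measurement.
rows = [
    {"site": "A", "ph": 7.1, "nitrate": 3.2},
    {"site": "B", "ph": None, "nitrate": 2.9},
    {"site": "C", "ph": 6.8, "nitrate": None},
    {"site": "D", "ph": None, "nitrate": None},
    {"site": "E", "ph": 7.4, "nitrate": 3.8},
]
FIELDS = ("ph", "nitrate")


def drop_sparse(rows, fields, max_missing):
    """Remove rows that have more than max_missing missing fields."""
    return [r for r in rows if sum(r[f] is None for f in fields) <= max_missing]


def impute_median(rows, fields):
    """Fill remaining gaps with the median of the observed values in that column."""
    filled = [dict(r) for r in rows]  # copy so the input is left untouched
    for f in fields:
        observed = [r[f] for r in rows if r[f] is not None]
        fill = median(observed)
        for r in filled:
            if r[f] is None:
                r[f] = fill
    return filled


kept = drop_sparse(rows, FIELDS, max_missing=1)  # site D is removed
cleaned = impute_median(kept, FIELDS)
for r in cleaned:
    print(r)
```

Whether to remove or fill, and with what value, is a judgment call that depends on the data; the median here is just one simple choice.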

The assignments were built around individual projects. There were three presentations: exploratory data analysis, preliminary data analysis, then final. For the first two they could work together, but the final one had to be solo, as they needed to have individual topics (even if they used the same data sets). The intent was that these assignments would build towards a final goal (but they had flexibility to bail mid-semester if they wanted to).

Probably 1/4 of the students found projects off of Kaggle, which is useful because it has a nice complex data set and comes with a legitimate question.  Another 1/4 of the students used public health as a motivating area (University of Pittsburgh is home to Project Tycho, which is a rich dataset of infectious disease in the U.S., also, there is a joint program with the Department of Industrial Engineering and the School of Public Health).

Some problems came up. First, I discovered that many of the students had an operating assumption that all data was normally distributed, and they constantly made claims that their data was normal, even when the data was noticeably skewed.  This was embarrassing when they ran statistical tests and the test graphic included the corresponding normal approximation, which was nowhere near the data. I eventually figured out that for many of them, when they took statistics they were constantly fitting normal distributions in their homework data sets, so I explained that their textbook problems were written so that there would be a normal distribution to find.
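A quick check along these lines would have caught most of the "my data is normal" claims. This is only a sketch in Python with made-up numbers, not something from the course materials: it compares the mean to the median and computes the moment skewness, which is near zero for symmetric data and clearly positive for right-skewed data.

```python
from statistics import mean, median, pstdev


def sample_skewness(xs):
    """Moment skewness: the average cubed z-score of the data."""
    m = mean(xs)
    s = pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


# Made-up data sets for illustration.
symmetric = [2, 3, 4, 5, 6, 7, 8]
right_skewed = [1, 1, 2, 2, 3, 4, 40]

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    print(
        "%-12s mean=%.2f median=%.2f skewness=%.2f"
        % (name, mean(data), median(data), sample_skewness(data))
    )
```

A large gap between mean and median, or a skewness far from zero, is a hint to plot the data before claiming normality. It does not replace looking at a histogram or a Q-Q plot.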

Another problem was the lack of a hypothesis.  Many students started to pick problems that could be solved through linear regression and declared that because the result met a p-value criterion they were done (and in some cases, I recognized the data set as being a teaching data set). But even though they could fit a regression, there was no theory on why the data related in a given way. Essentially, they were pushing data through an algorithm without any subject understanding.  Most (not all) of them got the idea by the end of the second presentation.

A third difficulty was skipping the model evaluation.  Most of the methods covered have some parameter that is the analyst's choice, so they should have explained how they chose the value of that parameter.  Generally, this should have been a discussion of the tradeoff between closely approximating the observed data and overfitting.  Some students skipped this completely (essentially, this is what would happen if you fed data to an algorithm and then reported the result using all default values).
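As an illustration of what that discussion could look like, here is a small Python sketch (again, the course used R; the data is synthetic and the method, k-nearest-neighbor regression, was picked only because it has one obvious analyst-chosen parameter). It holds out part of the data and compares training error with validation error for several values of k.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the y-values of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of k-NN predictions on the given points."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


# Synthetic data: a sine curve plus noise (seeded so the run is repeatable).
rng = random.Random(0)
data = [(i / 10, math.sin(i / 10) + rng.gauss(0, 0.3)) for i in range(60)]
rng.shuffle(data)
train, valid = data[:40], data[40:]

candidates = [1, 3, 5, 9, 15, 25]
errors = {k: (mse(train, train, k), mse(train, valid, k)) for k in candidates}
best_k = min(candidates, key=lambda k: errors[k][1])

for k in candidates:
    print("k=%2d train_mse=%.3f valid_mse=%.3f" % (k, *errors[k]))
print("chosen k:", best_k)
```

With k=1 the training error is zero because every training point is its own nearest neighbor, which is exactly the "closely approximating the observed data" end of the tradeoff. The held-out error is what tells you whether that fit generalizes, and it is the number the students should have been reporting when justifying their parameter choice.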

One big observation I had by the first presentation was being able to identify the level of programming ability by the choice of projects.  I strongly suspect that a number of students were minimizing the programming required.  But that became reflected in the level of ambition of the projects.  Non-programmers tended to choose simplistic data sets with little variety.  I think the difference is the workload.  People who could program were able to slice the data available on a multitude of dimensions without regard to scale, since the computer would do all the repetitious work, while those who could not program generally were reluctant to have large sample populations or multiple data sets on the same population.

Things for next time.  First, impress on them the need to learn to program.  Essentially, the projects from those who could program were so much richer than those who could not (even at a low level of programming skill) that I was embarrassed for those who could not program.  Second, I should push harder on the need to have a hypothesis that was driven by domain understanding of the problem. This should be pushed harder from the very beginning to discourage people from merely pushing data through statistical methods and reporting results.

For teaching data mining, I think that the organization of the course needs more methods focus. The principal text was case driven, but that meant that methods were being introduced in a fairly arbitrary sequence.  I ended up doing a methodology focused review over the last few weeks. What I should do next time is, after the introductory section (Stanton's Introduction to Data Science), have the next several lectures be a tour of the classes of data mining methods (regression, classification, clustering, feature selection), then do the case studies.  One resource I found useful for this is the Journal of Statistical Software, many of whose articles are focused on R packages that implement classes of methods.

This was a very good course. I wish that the students had participated more (by the final presentation, some points were given based on sheer quantity of comments, which several students took advantage of). Some of the projects were much more ambitious than any other done in the MS program. And I have a much stronger argument about the need for the graduate students to know scientific programming as a skill set.

Sunday, May 11, 2014

Data Mining with R by Luís Torgo: Book review

Data Mining With R: Learning By Case Studies by Luís Torgo
My rating: 3 of 5 stars

Data Mining With R (DMwR) promotes itself as a book that introduces readers to R as a tool for data mining. It teaches this through a set of five case studies, where each starts with data munging/manipulation, then introduces several data mining methods to apply to the problem, and ends with a section on model evaluation and selection. It fills a place in the literature since it devotes a lot of space to data manipulation before applying the various methods and to model evaluation afterwards. But it is hard for people learning data mining since it spreads the types of model throughout the book.

I used this as one of two texts to teach data science to people whose programming and data analysis skills were generally at a very low level. The big advantage of using a programming environment such as R for data mining is the fact that you can do data manipulation in the language, then apply the methods. Many of my students have taken machine learning elsewhere, but they always used prepared data sets, so this emphasis on data manipulation with several very disparate data sets is a unique feature.

The second big advantage of this book is the focus on model selection. For each chapter, the book goes through the exercise of determining which model should be used, and how to diagnose the model to determine which one is appropriate and best for the problem. I especially appreciate the fact that in some cases, the conclusion of the book after model evaluation is that the method did not work for the problem and question at hand. Because most textbooks focus on demonstrating that you did find something, in some cases my students get confused when in real problems they did not find an effect.

Where the book is lacking is the fact that the methods are scattered across the case studies with minimal organization. While this is a result of the realities of the cases, the book would have benefited from a roadmap chapter or introduction that gave methodological context (i.e. what methodologies are used in the book and where they are). This lack made it very difficult to use as a textbook, and by the time I was done using it I was essentially building the roadmap to use the book. This makes it not useful as a standalone textbook for such a course, but very good if there is another text that gives the overview of the methodologies.


Thursday, May 08, 2014

Making a folding worktable

First project of the summer was a folding worktable. The workbench I made last year is nice, but it is currently in use as a table sitting in the landing. In principle, it can be moved to the garage or the back patio as needed. In practice, it is hard to see that happening without a major cleanup effort. So I wanted a larger worktable that was still portable. The Black & Decker Workmate is too light for the task, and the sawbench I have is too small and low (but it is good for variety, and I still will need somewhere to sit).

I am using the worktable design that is described in a series of YouTube videos by "LJ Magnum" (video is at bottom of the post). The structure is made up of 2x4's, so it is rugged and stable. The top is a pair of 1x10 boards, so the top is fairly tough as well. The folding part is provided by 1" dowels, so it is stronger than it would be if I used hinges.

First the wood. Richard Arendt posted plans based on these videos. I needed two 8' 2x4's, one 8' 1x10, and one 36"x1" dowel. The cut list is in the posted plans.

Next was cutting the legs and supports. 22.5 and 45 degree angles were needed so that the table will be stable when opened. I cut these using a jigsaw. A jigsaw going through 2x4s is pretty rough, so I went back over them with an oscillating multi-tool, a file, and sandpaper afterwards. What I really need for this is a circular saw.

Lumber and boards for a folding workbench
Lumber cut by Home Depot

To drill holes in the legs and tabletop supports I used a cordless drill and a drill guide. Drilling 1" holes through 2x4s was a pretty heavy burden on an 18V cordless drill. I had to use both batteries. If I have to do this again, a corded drill may be useful. It was the first time I used a drill guide, which worked pretty well. One thing I learned is that when drilling a hole, it always makes a mess at the exit. So I learned to detect when the tip of the drill bit broke through, then flipped the board around and finished the hole from the other side. I used a chisel to cut away the mess (which was now in the middle instead of on the surface of the 2x4) and I used a round file to smooth out the hole (and made sure my 1" dowel would go through).

The Wolfcraft Drill Guide
Wolfcraft drill guide and cordless drill

I used the plans to assemble the legs and top supports. A long dowel went across the top through the inner supports and the inner pair of legs. On each side, the two legs, and the outer leg and outer support, were each connected with a short piece of dowel. For each dowel on each side, one piece was screwed in, while the other piece was free to rotate around the dowel.

Legs and supports for a folding workbench
Structural components of worktable

Finally, I put the top on. First, for each of the two 1x10 boards, I picked a side to be on top (the one with fewer pits around knots). Then I picked the edges that best fit each other, using a surefile to make the fit a bit better. Then I drove screws to attach them to the supports (and messed up which support went with which top along the way, so I had to redo one side).

Folding workbench folded up
Folded worktable top and front braces

The result is a folding worktable. It is pretty stable due to the mass of the 2x4s that make up the legs and top supports and the use of the 1" dowels as joints, more so than a commercial folding worktable with metal legs and bolts for joints would be. And it is fairly large. The fact that it is about standard table height also means that it can be used as a table for other things, like a folding table for eating that is much more stable than the norm.

The finished folding workbench
Completed worktable

Folded up folding workbench
Folded up worktable standing

I'm pretty happy with this one. Folds up nicely so I can put it out of the way, and stronger than most comparable folding tables. And I learned a few things while making it. A successful project.

Monday, May 05, 2014

Agile Data Science by Russell Jurney: Book review

Agile Data Science: Building Data Analytics Applications with Hadoop by Russell Jurney
My rating: 4 of 5 stars

One of the problems with data science is that any description of what is required takes on the appearance of a mythical unicorn: no one person could possibly have all of the skills required. And it gets worse when you add to the standard set of statistics, domain knowledge, and programming the ability to deploy the application into a high speed environment. This book is not going to make a data scientist an expert in running a data center, but it is useful for giving someone who has the rest of the skills an understanding of the environment their work will be deployed into.

One of the conflicts between the data scientist/analyst and information technology groups is that while the data scientist gives the data owned by the organization its value, IT is charged with storing the data and providing the access. And in the high velocity, high volume environment of big data, not understanding how the architecture works can lead to the data scientist creating valid solutions that cannot be applied in the actual day-to-day working environment. That is where this book comes in. The book has associated virtual machines in a software repository so that the data scientist who does not know anything about infrastructure and the software stack that the data and the analysis ride on can see how everything fits together.

The book title is misleading. This is not a book about data analytics. This is a book for data analysts so they know how their analytical application is deployed and applied to day-to-day use in enterprise environments. For that reason it is useful.

Disclaimer: I received a free electronic copy of Agile Data Science as part of the O'Reilly Press Blogger program.

Friday, May 02, 2014

Parenting Month 42: Gender roles for preschoolers

I need to turn this to get the eye in the project
Making a birdbath at Home Depot
Earlier this month, a girl at church around the same age as T told him that he could not sing songs from Frozen because he is a boy.  Fortunately, his friends at daycare don't care so much about who is singing what.

When T was born, we did not care if T was going to be a boy or girl. When we were given clothes we did not care, figuring a baby can wear lots of pink stuff and not care (we do draw the line at wearing dresses though).  We always figured that kids develop gender roles on their own time, so we did not feel the need to hurry it along. And in the meantime, let him develop a range of interests.

Planting parsley and oregano
But the reason for being deliberate about this is not because of the pre-school years, where things like dresses are merely cute. The issue comes when this is a pattern, when this becomes statements on what someone can and cannot do.  I was talking with an undergraduate whose senior research project I was mentoring about the need to not lose talent because of preconceived notions of what people could do.  Previously I've had those conversations with students about why diversity is important, to increase the pool of talent that can be drawn from (of course, if someone is doing something where there is no problem to find a mindless, unskilled workforce, I can see why diversity is not an issue).

What is the alternative?  It can become acceptable to view entire categories of people as without value and as things unworthy of consideration. When I was in grad school a friend commented that it was of no use to make friends with Christian females because they would just get married and end any friendships.  Another friend viewed it as a profound act of disloyalty that I believed his girlfriend's life was of value and responded to her calls for help in the backcountry. A pastor declared that a girl I was seeing, as well as any other friend of mine, were social freaks, while trying to convince me that I was romantically interested in a girl whom I never got to know well enough to recognize her name.

The alternative to rejecting gender roles, as presented in this middle-to-upper class American society, is to claim that women have nothing to contribute to the wider world.  That there is no need to recognize that a person has views, ideas, thoughts because of her gender.  That there is no value in recognizing a person other than visually, because there is nothing there to learn.

Are there people who want it that way? Certainly. It was certainly promoted by some of my fellow graduate students.  I have heard it in churches, both Chicago and Pittsburgh.  And if there are people who want to be known only as a possession or as something to look at, it is their right.

But that is not the world that I live in. My world is one where the rarity of talent drives its economy. My professional world is one where I've been taught to think about developing talent over years, and things like childrearing are only a small part of that. And it certainly is not useful to think of entire classes of people as freaks or merely eye-candy without individual evaluation.

We have made a choice not to push gender roles on our children before they are needed. It means that they have choices on their interests, hobbies, and careers. And we hope that it means that they will surround themselves with people whom they evaluate based on competence and depth of character and insight, not the surface actions of people who do not wish to be anything more than a figure to be viewed from a distance.