Natural language parsing has typically been done with small sets of discrete cate- gories such as NP and VP, but this representation does not capture the full syntactic nor semantic richness of linguistic phrases, and attempts to improve on this by lex- icalizing phrases only partly address the problem at the cost of huge feature spaces and sparseness. To address this, we introduce a recursive neural network architec- ture for jointly parsing natural language and learning vector space representations for variable-sized inputs. At the core of our architecture are context-aware re- cursive neural networks (CRNN). These networks can induce distributed feature representations for unseen phrases and provide syntactic information to accurately predict phrase structure trees. Most excitingly, the representation of each phrase also captures semantic information: For instance, the phrases “decline to com- ment” and “would not disclose the terms” are close by in the induced embedding space. Our current system achieves an unlabeled bracketing F-measure of 92.1% on the Wall Street Journal development dataset for sentences up to length 15.