The increase in network traffic volumes challenges the scalability of security analysis tools. In this paper, we present NetLearn, a solution to identify potentially malicious network entities from large amounts of network traffic data. NetLearn applies recently developed natural language processing algorithms to discover security-relevant relationships between the observed network entities, e.g., domain names and IP addresses, without requiring external sources of information for its analysis.