High-performance, domain-specific embedded architectures, composed of microprocessors, memories, and a number of dedicated coprocessors, are very hard to program. The applications executed on these architectures belong to the domains of (real-time) multimedia processing, mobile communication, encryption, or adaptive array processing. On such architectures, instruction-level parallelism can be exploited effectively, but task-level parallelism cannot. To exploit task-level parallelism, we use the Kahn Process Network model of computation. In the presentation, we explain the Kahn Process Network model in more detail and indicate why it is interesting for system-level design of stream-oriented applications. We present our design methodology, based on the Compaan/Laura compilers, and a case study in which we convert an M-JPEG application into a Kahn Process Network description and map some of the processes onto a CPU and the others onto an FPGA.
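
As a minimal illustration of the Kahn Process Network model (not of the Compaan/Laura tool flow, and not of the M-JPEG case), the following Python sketch runs three concurrent processes that communicate only through unbounded FIFO channels with blocking reads. Every name in it (`producer`, `scale`, `consumer`, `run_network`, the `END` marker) is hypothetical and chosen for illustration; threads and `queue.Queue` stand in for the processes and channels, which in the setting described here would be mapped onto a CPU or an FPGA.

```python
import queue
import threading

# Hypothetical end-of-stream marker; a convenience for this sketch,
# not part of the Kahn Process Network model itself.
END = object()


def producer(out_ch, n):
    """Source process: writes n tokens, then the end marker."""
    for i in range(n):
        out_ch.put(i)  # a write to an unbounded FIFO never blocks
    out_ch.put(END)


def scale(in_ch, out_ch, factor):
    """Transform process: one blocking read and one write per token."""
    while True:
        token = in_ch.get()  # blocking read: wait until a token is available
        if token is END:
            out_ch.put(END)
            return
        out_ch.put(token * factor)


def consumer(in_ch, result):
    """Sink process: collects the incoming tokens in a list."""
    while True:
        token = in_ch.get()
        if token is END:
            return
        result.append(token)


def run_network(n=5, factor=2):
    """Build the chain producer -> scale -> consumer and run it to completion."""
    a, b = queue.Queue(), queue.Queue()  # unbounded FIFO channels
    result = []
    processes = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=consumer, args=(b, result)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return result


if __name__ == "__main__":
    print(run_network())  # prints [0, 2, 4, 6, 8]
```

Because the processes share no state and interact only through FIFO channels with blocking reads, the result does not depend on how the threads are scheduled. This determinism is the property that lets the processes of such a network be distributed over different processing resources without changing the application's behavior.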