With only observational data on two variables, and without further assumptions, it is not possible to infer which one causes the other. Much of the causal literature has focused on guaranteeing identifiability of causal direction in statistical models for datasets where strong assumptions hold, such as additive noise or restrictions on parameter count. These methods are then tested on realistic datasets, most of which violate the assumptions, so the models cannot fit them properly. We show how to use causal assumptions within the Bayesian framework. This allows us to specify a model that does not artificially restrict the datasets it can fit, while still encoding independent causal mechanisms, which leads to an asymmetry between the causal directions. Identifying causal direction then becomes a Bayesian model selection problem. This flexibility does imply that some ambiguous datasets exist for which the causal direction cannot be identified. To demonstrate our approach, we construct a Bayesian non-parametric model that can flexibly model the joint distribution. While making few choices in constructing our model, we outperform previous methods on a wide range of benchmark datasets.