This study examines in-context learning (ICL) in transformers, that is, their ability to adapt to novel tasks without parameter updates. Previous research has shown that transformers can learn simple function classes in-context at near Bayes-optimal levels; here we ask whether vision transformers can learn intuitive physics in-context. Specifically, we train a vision transformer to predict the next frame in a sequence of images depicting a bouncing ball, varying the strength of gravity and the elasticity of the ball. We then test how well the model generalizes to unseen parameter combinations, as a measure of the in-context learning capabilities of vision transformers in intuitive physics.
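
To make the task setup concrete, the following is a minimal sketch of how bouncing-ball sequences with varying gravity and elasticity might be generated and split into seen and unseen parameter combinations. Everything here is an illustrative assumption rather than the setup actually used in the study: the function names (`simulate_heights`, `render`, `make_sequence`), the one-dimensional vertical motion, the Euler integration, the 16x16 single-pixel rendering, and the specific parameter values are all hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]


def simulate_heights(gravity: float, elasticity: float, n_frames: int,
                     y0: float = 1.0, dt: float = 0.05) -> List[float]:
    """Ball height per frame using simple Euler steps; the floor is at y = 0.

    Assumption: vertical motion only, with elasticity in [0, 1] acting as
    the fraction of speed kept after each bounce.
    """
    y, v = y0, 0.0
    heights = []
    for _ in range(n_frames):
        heights.append(y)
        v -= gravity * dt
        y += v * dt
        if y < 0.0:
            # Reflect off the floor and damp the motion by the elasticity.
            y = -y * elasticity
            v = -v * elasticity
    return heights


def render(height: float, size: int = 16, y_max: float = 1.0) -> Frame:
    """Render the ball as one lit pixel in the middle column (row 0 is the top)."""
    frame = [[0] * size for _ in range(size)]
    clamped = min(max(height / y_max, 0.0), 1.0)
    row = (size - 1) - round(clamped * (size - 1))
    frame[row][size // 2] = 1
    return frame


def make_sequence(gravity: float, elasticity: float,
                  n_frames: int = 20, size: int = 16) -> List[Frame]:
    """One image sequence for a fixed (gravity, elasticity) pair."""
    return [render(h, size) for h in simulate_heights(gravity, elasticity, n_frames)]


# Hypothetical parameter grid; the held-out pairs play the role of
# "unseen parameter combinations" at test time.
gravities = [5.0, 10.0, 15.0]
elasticities = [0.5, 0.7, 0.9]
combos: List[Tuple[float, float]] = [(g, e) for g in gravities for e in elasticities]
held_out = {(5.0, 0.9), (15.0, 0.5)}
train_combos = [c for c in combos if c not in held_out]
test_combos = [c for c in combos if c in held_out]

# Next-frame prediction: earlier frames serve as context, the last as target.
sequence = make_sequence(gravity=10.0, elasticity=0.7)
context_frames, target_frame = sequence[:-1], sequence[-1]
```

In this sketch the physical parameters are never given to the model directly; they would have to be inferred from the context frames, which is what makes the prediction task an in-context one.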