The rapid growth and advancement of smart vehicles have enabled applications such as autonomous driving and virtual reality in vehicular environments. Most of these applications demand substantial storage and computation power, which smart vehicles generally lack. Vehicular edge computing is a promising solution that enables vehicles with limited processing capacity to run intelligent applications efficiently by offloading tasks to roadside units, such as base stations and access points, or to neighboring vehicles within communication range. Because vehicular environments are highly dynamic, making stable computation-offloading decisions under environmental uncertainty is challenging. In this work, we present an online learning procedure based on Q-Learning that allows vehicles to learn the offloading delay performance by interacting with the environment. Simulation results show that the proposed algorithm outperforms existing upper confidence bound (UCB) algorithms in terms of average offloading delay.
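To illustrate the kind of procedure the abstract describes, the sketch below shows a stateless (bandit-style) Q-Learning loop in which a vehicle repeatedly chooses an offloading target and updates its value estimate from the observed task delay. The action set, delay distributions, and hyperparameters are assumptions for illustration only, not the paper's actual model or results.

```python
import random

# Hypothetical offloading targets and mean delays (seconds) -- illustrative only.
ACTIONS = ["local", "rsu", "neighbor"]
MEAN_DELAY = {"local": 0.9, "rsu": 0.3, "neighbor": 0.6}

def observed_delay(action, rng):
    """Simulated noisy delay for offloading one task to `action`."""
    return max(0.0, rng.gauss(MEAN_DELAY[action], 0.1))

def q_learn(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-Learning over offloading targets (stateless form)."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}  # one Q-value per offloading target
    for _ in range(episodes):
        # Explore a random target with probability epsilon, else exploit.
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(q, key=q.get)
        reward = -observed_delay(a, rng)  # lower delay means higher reward
        q[a] += alpha * (reward - q[a])   # incremental Q-value update
    return q

q = q_learn()
best = max(q, key=q.get)
print(best)  # the target with the lowest average delay dominates
```

Using the negative delay as the reward means maximizing the Q-value is equivalent to minimizing the average offloading delay, which mirrors the performance metric the abstract reports.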