We present Tangram, a programming system for writing performance-portable programs. The language lets programmers write computation and composition codelets, supported by tuning knobs and primitives for expressing data parallelism and work decomposition. The compiler and runtime apply techniques including hierarchical composition, coarsening, data placement, tuning, and runtime selection based on input characteristics and microprofiling. The resulting performance is competitive with optimized vendor libraries.
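
To make the idea of runtime selection by microprofiling concrete, the following Python sketch times interchangeable implementations of the same computation on a small prefix of the input and then runs the fastest one on the full input. It is an illustration under stated assumptions, not Tangram's language, compiler, or runtime API: the names `sum_sequential`, `sum_tiled`, `microprofile`, and `select_and_run`, the tile size, and the sample size are all hypothetical.

```python
import time
from typing import Callable, Dict, Sequence, Tuple

Variant = Callable[[Sequence[float]], float]


def sum_sequential(xs: Sequence[float]) -> float:
    """Hypothetical variant 1: a plain sequential reduction."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_tiled(xs: Sequence[float], tile: int = 256) -> float:
    """Hypothetical variant 2: split the input into tiles, reduce each
    tile, then combine the partial sums (a simple work decomposition)."""
    partials = [sum_sequential(xs[i:i + tile]) for i in range(0, len(xs), tile)]
    return sum_sequential(partials)


# Interchangeable implementations of the same computation (assumed names).
VARIANTS: Dict[str, Variant] = {
    "sequential": sum_sequential,
    "tiled": sum_tiled,
}


def microprofile(variants: Dict[str, Variant],
                 sample: Sequence[float],
                 repeats: int = 3) -> Dict[str, float]:
    """Time each variant on a small sample, keeping the best of `repeats` runs."""
    timings: Dict[str, float] = {}
    for name, fn in variants.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(sample)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


def select_and_run(variants: Dict[str, Variant],
                   data: Sequence[float],
                   sample_size: int = 1024) -> Tuple[str, float]:
    """Profile on a prefix of the input, then run the fastest variant on all of it."""
    sample = data[:sample_size]
    timings = microprofile(variants, sample)
    chosen = min(timings, key=timings.get)
    return chosen, variants[chosen](data)


if __name__ == "__main__":
    data = [1.0] * 10_000
    chosen, result = select_and_run(VARIANTS, data)
    print(f"selected variant: {chosen}, result: {result}")
```

Which variant is selected depends on the machine and the input, which is the point of profiling at run time rather than fixing the choice ahead of time. A real system would also weigh the profiling overhead against the expected gain; this sketch ignores that cost.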