Add option to use smaller AES tables (table sizes reduced by 6144 bytes)

This patch adds MBEDTLS_AES_SMALL_TABLES option to reduce number of AES
look-up tables and thus save 6 KiB of memory. Enabling this option
cause performance hit MBEDTLS_AES_SMALL_TABLES of ~7% on ARM and ~15%
on x86-64.

Benchmark on Cortex-A7 (armhf):

Before:
  AES-CBC-128              :      14394 Kb/s,          0 cycles/byte
  AES-CBC-192              :      12442 Kb/s,          0 cycles/byte
  AES-CBC-256              :      10958 Kb/s,          0 cycles/byte

After:
  AES-CBC-128              :      13342 Kb/s,          0 cycles/byte
  AES-CBC-192              :      11469 Kb/s,          0 cycles/byte
  AES-CBC-256              :      10058 Kb/s,          0 cycles/byte

Benchmark on Intel Core i5-4570 (x86_64, 3.2 Ghz, no turbo):

Before:
  AES-CBC-128              :     215759 Kb/s,         14 cycles/byte
  AES-CBC-192              :     190884 Kb/s,         16 cycles/byte
  AES-CBC-256              :     171536 Kb/s,         18 cycles/byte

After:
  AES-CBC-128              :     185108 Kb/s,         16 cycles/byte
  AES-CBC-192              :     162839 Kb/s,         19 cycles/byte
  AES-CBC-256              :     144700 Kb/s,         21 cycles/byte
diff --git a/include/mbedtls/config.h b/include/mbedtls/config.h
index c4b8995..44def95 100644
--- a/include/mbedtls/config.h
+++ b/include/mbedtls/config.h
@@ -388,6 +388,15 @@
 //#define MBEDTLS_AES_ROM_TABLES
 
 /**
+ * \def MBEDTLS_AES_SMALL_TABLES
+ *
+ * Use less ROM/RAM for the AES implementation (saves about 6144 bytes).
+ *
+ * Uncomment this macro to use less memory for AES.
+ */
+//#define MBEDTLS_AES_SMALL_TABLES
+
+/**
  * \def MBEDTLS_CAMELLIA_SMALL_MEMORY
  *
  * Use less ROM for the Camellia implementation (saves about 768 bytes).